Abstract - We investigate the I-topology and the rI-topology on the state space of a
C*-subalgebra of Mat(n, C), which are defined in terms of convergence with respect to the
relative entropy. In quantum information geometry there are a Pythagorean theorem and
a projection theorem, valid for an exponential family. We achieve the completion of these
theorems to the rI-closure of the exponential family. The completion to the norm closure
is not possible since this can be strictly larger than the rI-closure.

The complete projection theorem proves the existence of an rI-projection, a projection
with linear fibers from the whole state space onto the rI-closure of the exponential family.
The rI-projection allows us to study the entropy distance (the infimum of the relative
entropy) of a state from the exponential family. We discuss the non-commutative feature
of a discontinuous entropy distance and we prove two necessary conditions for local
maximizers of the entropy distance. The complete Pythagorean theorem solves a problem in
quantum information theory: Maximization of the von Neumann entropy under linear
constraints. We provide previously unknown solutions without any support restrictions.
The solution set is the rI-closure of the well-known exponential family of Gibbs ensembles.

Index Terms - non-commutative algebra, relative entropy, information topology,
exponential family, convex support, Pythagorean theorem, projection theorem, von Neumann
entropy.

1 Introduction



The Pythagorean theorem and projection theorem in information geometry make
statements about the distance of a probability measure from a family of probability
measures, see §3 in [AN00] by Amari and Nagaoka and §I.C in [CM03] by Csiszár
and Matúš. They provide a geometric frame for large deviation theory or
maximum-likelihood estimation. While information geometry is often confined to families
of mutually absolutely continuous probability measures, some theorems have been
extended [Ba78, Ce82, CM03, CM05] using the I-/rI-convergence^1 defined in terms of
the relative entropy. This convergence is explained in §1.1. In quantum information
theory, see e.g. [AN00, Be09, BZ06, Ho11, IO97, NC00, Pe08], there is also a relative
entropy, the Umegaki relative entropy, and one can ask the analogous questions as in
classical probability theory.

We report in §1.4 on the I-/rI-convergence on the state space of a C*-subalgebra
of Mat(n, C) and about the associated topology, namely the I-/rI-topology. The
I-topology includes the rI-topology and both include the norm topology. The
I-/rI-topology has properties of a metric topology, e.g. its open sets are unions of disks
arising from the relative entropy. But it is quite distinct from the norm topology, e.g.
it is not second countable (unless the algebra is commutative) and the state space
splits into the connected components of its faces with respect to the I-topology.

The usefulness of the I-/rI-topology in quantum statistics and quantum hypothesis
testing is not yet clarified. In contrast to classical probability theory there is
no canonical choice of a computational basis and the consequences for measurement
and observation will have to be taken into consideration. We have found the general
result that the infimum of the relative entropy of a state from a set of states does
not decrease if a topological closure operation is applied to the set. This property
can be used e.g. in the Sanov theorem of quantum hypothesis testing [BS05].

We have found several applications of the rI-topology in quantum information
geometry and quantum information theory. The Pythagorean theorem and projection
theorem in information geometry are valid for the relative entropy and for
exponential families (we recall them in §1.3). In §1.5 we present their extensions to
a complete Pythagorean theorem and to a complete projection theorem.

^1 Here and in the sequel "I" stands for "information" and "rI" for "reverse information".

The complete projection theorem implies an rI-projection with linear fibers,
defined for all states, onto the rI-closure of an exponential family. The rI-projection
allows us to compute the entropy distance from an exponential family. An important
example of entropy distance is the stochastic interaction measure of
multi-information, see the references in [KW11]. This interaction measure is interesting
in statistical physics where phase coexistence generically appears in highly
correlated systems. Maximization of the entropy distance from an exponential family is
advocated in [Ay02] as a structuring principle in natural systems. In §4.3 we prove
two necessary conditions for its local maximizers. One of the conditions is an upper
bound on the rank; it enforces a certain degree of determinism on a local maximizer.

We present in §1.5 an application of the complete Pythagorean theorem. This
allows us to compute on the state space of a C*-subalgebra of Mat(n, C) all
maximizers of the von Neumann entropy under linear constraints. To the best of our
knowledge, the non-invertible solutions were previously unknown; partial answers were
found by Wichmann [Wi63]. An innovation in our analysis is to study non-exposed faces^2 of
projected state spaces using Grünbaum's notion of poonem. The notion of poonem
is equivalent to the notion of face and is defined in §2.4 by sequences of consecutively
exposed faces, also known as access sequences [CM05].

In short, the I-/rI-topology behaves much like a metric topology and at the
same time it still has the capacity to respect the convex geometry of the state space.
The rI-topology has many applications. Further research will show if shorter proofs
exist for the complete projection theorem, e.g. using a parallelogram-like identity
as in Theorem 1 in [CM03]. Conversely, our results may be helpful to discover new
identities or inequalities in quantum information theory.

1.1 Information convergence in classical probability theory 

We recall some key properties of information convergence in probability theory. A 
comprehensive historical account is described in §I.C in [CM03]. 

Let $\mathcal{M}$ be a set of probability measures on a measurable space $(X, \mathcal{X})$.
If $P, Q \in \mathcal{M}$ are absolutely continuous with respect to a $\sigma$-finite measure
$\lambda$ and $p(x)$ resp. $q(x)$ is the Radon-Nikodym derivative of $P$ resp. $Q$, then the
relative entropy is

$$ D(P\|Q) = \int_X p(x) \log\frac{p(x)}{q(x)} \, d\lambda. \tag{1} $$

This equals zero if and only if $P = Q$ and otherwise $D(P\|Q)$ is strictly positive or
$+\infty$. Given a sequence $(P_n)_{n\in\mathbb{N}} \subset \mathcal{M}$ and a probability
measure $P \in \mathcal{M}$ we have, according to [CM03], I-convergence resp. rI-convergence
of $(P_n)_{n\in\mathbb{N}}$ to $P$ if

$$ \lim_{n\to\infty} D(P_n\|P) = 0 \quad\text{resp.}\quad \lim_{n\to\infty} D(P\|P_n) = 0. \tag{2} $$
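The difference between the two convergences in (2) already shows up on a two-point space. The following minimal sketch is an illustration only: the sequence $P_n = (1 - 1/n, 1/n)$ and the helper name `relative_entropy` are assumptions made here, not taken from [CM03]. It evaluates (1) for probability vectors and shows that $P_n$ rI-converges to the point mass $P = (1, 0)$ while it does not I-converge.

```python
import math

def relative_entropy(p, q):
    """D(P||Q) for probability vectors on a finite set (natural logarithm).

    Terms with p_i = 0 contribute 0; the value is +inf as soon as
    p_i > 0 while q_i = 0 (P is not absolutely continuous w.r.t. Q).
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

# Hypothetical example sequence: P_n = (1 - 1/n, 1/n), limit P = (1, 0).
P = (1.0, 0.0)
ns = [10, 100, 1000, 10000]

# rI-convergence: D(P || P_n) = -log(1 - 1/n) tends to zero.
rI_values = [relative_entropy(P, (1 - 1 / n, 1 / n)) for n in ns]

# No I-convergence: D(P_n || P) is +inf for every n.
I_values = [relative_entropy((1 - 1 / n, 1 / n), P) for n in ns]
```

Here `rI_values` decreases towards zero and every entry of `I_values` is infinite, since each $P_n$ charges a point that $P$ does not.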

Csiszár has studied these convergences in the much wider context of f-divergence.
He has proved in Theorem 3 in [Cs67] that the information neighborhoods

$$ \{Q \in \mathcal{M} \mid D(Q\|P) < \epsilon\} \quad\text{resp.}\quad \{Q \in \mathcal{M} \mid D(P\|Q) < \epsilon\} \qquad (P \in \mathcal{M},\ \epsilon > 0) \tag{3} $$

do not define a base of a topology if $(X, \mathcal{X}) = (\mathbb{N}, 2^{\mathbb{N}})$.

^2 Non-exposed faces were ignored in the erroneous statement of Theorem 1 e) in [Wi63].

At a later time Dudley and Harremoës [Du98, Ha07] have studied topologies on
$\mathcal{M}$ that are defined by sequential convergence (2). In a different approach, Csiszár
[Cs67] considers $\mathcal{M}$ as a Fréchet (V)-space (a generalization of topological space,
see §1 in [Si52]) arising from the family of information neighborhoods (3).

We show in §3 that the two analogous approaches in a C*-subalgebra of Mat(n, C)
are compatible: The Fréchet (V)-space of information neighborhoods is a topological
space and its topology corresponds to the convergence.

1.2 Representation of states and mean values 

We shall work throughout with a C*-subalgebra $\mathcal{A}$ of Mat(n, C), i.e. a norm-closed
self-adjoint subalgebra $\mathcal{A}$ of Mat(n, C).

A state on $\mathcal{A}$ is a complex linear functional $f : \mathcal{A} \to \mathbb{C}$, such
that $f(a^*a) \geq 0$ for all $a \in \mathcal{A}$ and $f(\mathbb{1}) = 1$. Bratteli and Robinson
provide in Theorem 2.4.21 in [Br87] a proof of the one-to-one correspondence between states
$f$ on $\mathcal{A}$ and matrices $\rho \succeq 0$ of trace one in $\mathcal{A}$

$$ f(a) = \langle a, \rho \rangle \qquad (a \in \mathcal{A}). $$

The matrix representation $\rho$ of $f$ is called density matrix in quantum mechanics and
we will use the terms of state and density matrix synonymously. The state space is

$$ \mathcal{S}_{\mathcal{A}} = \{\rho \in \mathcal{A} \mid \rho \succeq 0,\ \operatorname{tr}(\rho) = 1\}. $$

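Definition 1.1. 1. We denote the identity in Mat(n, C) by $\mathbb{1}_n$, the identity in
$\mathcal{A}$ by $\mathbb{1}$. By $\mathcal{A}_{\mathrm{sa}}$ we denote the real vector space of
self-adjoint matrices in $\mathcal{A}$. Let $a \in \mathcal{A}$. The spectrum of $a$ is

$$ \operatorname{spec}_{\mathcal{A}}(a) := \{\lambda \in \mathbb{C} \mid a - \lambda\mathbb{1} \text{ is not invertible in } \mathcal{A}\}, $$

its elements are the spectral values of $a$ in $\mathcal{A}$. The matrix $a$ is positive
semi-definite if $a \in \mathcal{A}_{\mathrm{sa}}$ and if $a$ has no negative spectral values,
we then write $a \succeq 0$. If $a \succeq 0$, then there exists $b \in \mathcal{A}$,
$b \succeq 0$ with $a = b^2$, see e.g. §2.2 in [Mu90]. The matrix $b$ is unique and one
defines $\sqrt{a} := b$. We have $a^*a \succeq 0$ and put $|a| := \sqrt{a^*a}$.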

2. The standard trace tr turns Mat(n, C) into a complex Hilbert space with the
Hilbert-Schmidt inner product $\langle a, b \rangle := \operatorname{tr}(ab^*)$ for
$a, b \in$ Mat(n, C) and we use the two-norm $\|a\|_2 := \sqrt{\langle a, a \rangle}$. We
consider also the spectral norm $\|a\|$, which is the square root of the largest eigenvalue
of $a^*a$, and the trace norm $\|a\|_1 := \operatorname{tr}|a|$. The topology of any norm is
the norm topology and convergence of a sequence $(a_i)_{i\in\mathbb{N}} \subset \mathcal{A}$
to $a \in \mathcal{A}$ in any norm will be denoted by $\lim_{i\to\infty} a_i = a$. The three
norms restrict to $\mathcal{A}_{\mathrm{sa}}$ and we consider
$(\mathcal{A}_{\mathrm{sa}}, \langle\cdot,\cdot\rangle)$ a Euclidean vector space with
the Hilbert-Schmidt inner product.

3. In a Euclidean vector space $(E, \langle\cdot,\cdot\rangle)$ we denote the two-norm by
$\|x\|_2 := \sqrt{\langle x, x \rangle}$ and write $x \perp y :\iff \langle x, y \rangle = 0$
for $x, y \in E$. For subsets $X, Y \subset E$ we write
$X \perp Y :\iff \langle x, y \rangle = 0\ \forall x \in X, y \in Y$ and
$X^\perp := \{y \in E \mid x \perp y\ \forall x \in X\}$ (if $z \in E$ then
$z \perp Y :\iff \{z\} \perp Y$ and $z^\perp := \{z\}^\perp$). If $\mathbb{A}$ is a non-empty
affine subspace of $E$, then we denote the translation vector space of $\mathbb{A}$ by
$\operatorname{lin}(\mathbb{A})$. We denote the orthogonal projection from $E$ onto
$\mathbb{A}$ by $\pi_{\mathbb{A}} : E \to \mathbb{A}$. This mapping is characterized by the
equations $x - \pi_{\mathbb{A}}(x) \perp \operatorname{lin}(\mathbb{A})$ for all $x \in E$.

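4. The mean value set of a linear subspace $U \subset \mathcal{A}_{\mathrm{sa}}$ is the
orthogonal projection of the state space onto $U$

$$ \mathbb{M}(U) := \mathbb{M}_{\mathcal{A}}(U) := \pi_U(\mathcal{S}_{\mathcal{A}}) \subset U. \tag{4} $$

The mean value mapping is defined for $u_1, \ldots, u_k \in \mathcal{A}_{\mathrm{sa}}$ by

$$ m_{u_1,\ldots,u_k} : \mathcal{A}_{\mathrm{sa}} \to \mathbb{R}^k, \quad a \mapsto (\langle u_1, a \rangle, \ldots, \langle u_k, a \rangle). \tag{5} $$

The convex support of $u_1, \ldots, u_k \in \mathcal{A}_{\mathrm{sa}}$ is

$$ \operatorname{cs}(u_1,\ldots,u_k) := \operatorname{cs}_{\mathcal{A}}(u_1,\ldots,u_k) := \{m_{u_1,\ldots,u_k}(\rho) \mid \rho \in \mathcal{S}_{\mathcal{A}}\} \subset \mathbb{R}^k. \tag{6} $$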

Conditional probability measures and their formal generalization in a C*-subalgebra
of Mat(n, C) are correctly rendered by spectral values (and not by eigenvalues).
These concepts will be used in §4.2 and thereafter.

Remark 1.2. 1. We denote the probability simplex of a non-empty (at most) countable
set $\Omega$ by

$$ \mathbb{P}(\Omega) := \{p = (p_\omega)_{\omega\in\Omega} \in [0,1]^\Omega \mid \textstyle\sum_{\omega\in\Omega} p_\omega = 1\}. \tag{7} $$

The elements of $\mathbb{P}(\Omega)$ are called probability vectors on $\Omega$. In the sequel
we shall identify the probability vectors $p \in \mathbb{P}(\Omega)$ with the probability
measures $P$ on $(\Omega, 2^\Omega)$ using $P(A) := \sum_{\omega\in A} p_\omega$ for
$A \subset \Omega$. If $\Omega = \{1, \ldots, N\}$ is finite, we shall write
$\mathbb{P}(N) := \mathbb{P}(\{1, \ldots, N\})$.

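2. Let $\mathcal{A} \subset$ Mat(n, C) be C*-isomorphic to $\mathbb{C}^N$, e.g. $\mathcal{A}$
can be the set of diagonal matrices if $N = n$. If $e_1, \ldots, e_N$ denotes the standard
ONB of $\mathbb{C}^N$, then every state $\rho \in \mathcal{S}_{\mathcal{A}} \cong
\mathcal{S}_{\mathbb{C}^N}$ defines a probability vector by setting
$p_\omega := \operatorname{tr}(\rho e_\omega)$ for $\omega \in \{1, \ldots, N\}$ and we have a
one-to-one correspondence $\mathbb{P}(N) \cong \mathcal{S}_{\mathcal{A}}$.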

3. We use spectral values to describe conditional probability distributions. If
$A \subset \{1, \ldots, N\}$ then we have the natural inclusion of
$\mathbb{P}(A) \subset \mathbb{P}(N)$. If $P \in \mathbb{P}(N)$ and $P(A) > 0$, then the
conditional probability measure $P(\cdot|A) \in \mathbb{P}(A)$ is

$$ P(\{i\}|A) := P(\{i\})/P(A), \qquad (i \in A). $$

With the identifications in 2 we consider $P(\cdot|A) \in \mathbb{C}^A \subset \mathbb{C}^N
\subset$ Mat(n, C) in three algebras. If $A \subsetneq \{1, \ldots, N\}$, then $0$ is a
spectral value of $P(\cdot|A)$ in $\mathbb{C}^N$ and a spectral value (and eigenvalue) in
Mat(n, C). On the other hand the spectral values of $P(\cdot|A)$ in $\mathbb{C}^A$ give us
the conditional probabilities of $i \in A$.

4. Barndorff-Nielsen [Ba78] has introduced the concept of convex support in probability
theory; it was studied in [Ce82] and was refined in [CM03, CM05] to
investigate mean values and exponential families. For a finite measurable space
their definition reduces to ours with $\mathcal{A} \cong \mathbb{C}^N$.

5. The convex support introduces coordinates on the mean value set $\mathbb{M}(U)$. If
$u_1, \ldots, u_k \in \mathcal{A}_{\mathrm{sa}}$ and
$U := \operatorname{span}_{\mathbb{R}}(u_1, \ldots, u_k)$, then the convex bodies
$\mathbb{M}(U) \cong \operatorname{cs}(u_1, \ldots, u_k)$ are "affinely isomorphic" (see
Remark 1.1.1 in [We11]): The mean value mapping restricts to the bijection

$$ m_{u_1,\ldots,u_k}|_{\mathbb{M}(U)} : \mathbb{M}(U) \to \operatorname{cs}(u_1,\ldots,u_k) = \{m_{u_1,\ldots,u_k}(\rho) \mid \rho \in \mathcal{S}\} $$

such that $m_{u_1,\ldots,u_k}|_{\mathbb{M}(U)} \circ \pi_U = m_{u_1,\ldots,u_k}$. We prefer the
Hilbert-Schmidt geometry in $\mathcal{A}_{\mathrm{sa}}$ to the coordinates in $\mathbb{R}^k$
and perform most of our analysis in $\mathbb{M}(U)$. Exceptions are Corollary 4.16,
Theorem 4.26 and Lemma 5.1.

1.3 The Pythagorean theorem and the projection theorem 

Two theorems in quantum information geometry will be extended in §4 from the
invertible states to the whole state space. Here we recapitulate these theorems with
hints for their easy proofs.

Definition 1.3. The relative entropy of $\rho \in \mathcal{S}$ from $\sigma \in \mathcal{S}$ is
$S(\rho, \sigma) = +\infty$ unless $\operatorname{Im}(\rho) \subset \operatorname{Im}(\sigma)$
and then

$$ S(\rho, \sigma) := \operatorname{tr} \rho(\log(\rho) - \log(\sigma)). $$

The logarithm can be defined by functional calculus, see Remark 2.4.4.

The relative entropy satisfies $S(\rho, \sigma) \geq 0$ for all $\rho, \sigma \in \mathcal{S}$
with equality if and only if $\rho = \sigma$, see e.g. §11.3 in [Pe08] or §11.3 of [NC00]
(this result is sometimes cited as Klein's inequality). Convexity and continuity properties
are recalled in §2.2.

The Pythagorean theorem of relative entropy, see e.g. §3.4 in [Pe08], applies to
states $\rho, \sigma, \tau \in \mathcal{S}$ where $\sigma, \tau$ are invertible and
$\rho - \sigma \perp \log(\tau) - \log(\sigma)$ (using the Hilbert-Schmidt inner product).
We have

$$ S(\rho, \sigma) + S(\sigma, \tau) = S(\rho, \tau). \tag{8} $$

With relative entropy replaced by squared distance, this equation reminds us of
the Pythagorean theorem in Euclidean geometry. Classical counterparts of the
Pythagorean theorems (8) and (9) and of the projection theorem (11) are discussed
e.g. in §3.4 in [AN00].
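Equation (8) can be checked numerically on a qubit. The sketch below is an illustration under stated assumptions: it uses NumPy, the helper names (`logm_psd`, `gibbs`, `rel_entropy`) are hypothetical, and the three states are chosen by hand so that $\rho - \sigma$ is orthogonal to $\log(\tau) - \log(\sigma)$ in the Hilbert-Schmidt inner product. All three states are invertible, so the plain matrix logarithm suffices.

```python
import numpy as np

def logm_psd(a):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def gibbs(theta):
    """exp(theta) / tr exp(theta) for a Hermitian matrix theta."""
    w, v = np.linalg.eigh(theta)
    e = (v * np.exp(w)) @ v.conj().T
    return e / np.trace(e).real

def rel_entropy(rho, sigma):
    """S(rho, sigma) = tr rho (log rho - log sigma) for invertible states."""
    return np.trace(rho @ (logm_psd(rho) - logm_psd(sigma))).real

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Hand-picked example states (assumptions of this sketch):
tau = gibbs(0.3 * sz)
sigma = gibbs(0.3 * sz + 0.8 * sx)   # log(tau) - log(sigma) lies in span(sx, identity)
rho = sigma + 0.1 * sz               # traceless shift, orthogonal to sx and to the identity

# Hilbert-Schmidt inner product <rho - sigma, log(tau) - log(sigma)>
ortho = np.trace((rho - sigma) @ (logm_psd(tau) - logm_psd(sigma)).conj().T)

lhs = rel_entropy(rho, sigma) + rel_entropy(sigma, tau)
rhs = rel_entropy(rho, tau)
```

With these choices `ortho` vanishes and `lhs` agrees with `rhs` up to rounding. Replacing the shift `0.1 * sz` by a multiple of `sx` breaks the orthogonality and, in general, the identity.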

Definition 1.4. We use the real analytic function^3
$R_{\mathcal{A}} : \mathcal{A}_{\mathrm{sa}} \to \mathcal{A}_{\mathrm{sa}}$,

$$ R(\theta) = R_{\mathcal{A}}(\theta) := \exp_{\mathcal{A}}(\theta)/\operatorname{tr}(\exp_{\mathcal{A}}(\theta)), $$

the exponential $\exp_{\mathcal{A}}$ is defined by functional calculus in the algebra
$\mathcal{A}$, see Definition 2.3.3. For a non-empty real affine subspace
$\Theta \subset \mathcal{A}_{\mathrm{sa}}$ we define an exponential family in $\mathcal{A}$

$$ \mathcal{E} := R_{\mathcal{A}}(\Theta) = \{R_{\mathcal{A}}(\theta) \mid \theta \in \Theta\}. $$

We call a one-dimensional exponential family (with or without parametrization)
e-geodesic. We use the translation vector space
$U := \operatorname{lin}(\Theta) = \Theta - \Theta$.

Figure 1: The Staffelberg family is sketched by e-geodesics (thin curves). Closure
components in different topologies are indicated (bold). Since the Euclidean straight
line from $\sigma$ to $\rho$ intersects the e-geodesic from $\sigma$ to $\tau$ (red)
orthogonally with respect to the BKM-metric, the Pythagorean theorem
$S(\rho, \sigma) + S(\sigma, \tau) = S(\rho, \tau)$ holds.

In the literature, e.g. in §3.4 in [Pe08], the curve $t \mapsto R_{\mathcal{A}} \circ \alpha(t)$
is called e-geodesic, if $\alpha : \mathbb{R} \to \Theta$ is an affine map.

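Example 1.5. We shall use the Pauli $\sigma$-matrices
$\sigma_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,
$\sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ and
$\sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and for
$b = (b_1, b_2, b_3) \in \mathbb{R}^3$ we abbreviate
$b\sigma := b_1\sigma_1 + b_2\sigma_2 + b_3\sigma_3$.
The Staffelberg family, studied in [KW11], is the exponential family

$$ R(\operatorname{span}_{\mathbb{R}}(\sigma_1 \oplus 0,\ \sigma_2 \oplus 1)) $$

in the algebra Mat(2, C) $\oplus$ C. This exponential family is depicted in Figure 1.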

The Pythagorean theorem (8) applies to exponential families. We have for states
$\rho \in \mathcal{S}$ and $\sigma, \tau \in \mathcal{E}$, such that $\rho - \sigma \perp U$,

$$ S(\rho, \sigma) + S(\sigma, \tau) = S(\rho, \tau). \tag{9} $$

The condition $\rho - \sigma \perp U$ means that the Euclidean straight line from $\sigma$ to
$\rho$ is perpendicular to the exponential family $\mathcal{E}$ with respect to the
BKM-Riemannian metric, see Remark 4.2. This is indicated by the right angle in Figure 1.

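The projection theorem is now an easy corollary. For every state
$\rho \in \mathcal{E} + U^\perp$ the intersection $(\rho + U^\perp) \cap \mathcal{E}$ contains
a unique state $\pi_{\mathcal{E}}(\rho)$ and a projection to $\mathcal{E}$ is defined by

$$ \pi_{\mathcal{E}} : (\mathcal{E} + U^\perp) \cap \mathcal{S} \to \mathcal{E}, \quad \rho \mapsto \pi_{\mathcal{E}}(\rho). \tag{10} $$

The entropy distance (12) of $\rho$ from $\mathcal{E}$ is

$$ d_{\mathcal{E}}(\rho) = S(\rho, \pi_{\mathcal{E}}(\rho)). \tag{11} $$

Indeed, the intersection has at least one point. If two states $\sigma, \tau$ are in the
intersection, then $\rho - \sigma \perp U$ and $\rho - \tau \perp U$. Equality
$\sigma = \tau$ follows if we add the two corresponding Pythagorean equations (9).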
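The uniqueness step can be written out as a short worked computation. The following LaTeX sketch uses only (9) and the fact from Definition 1.3 that the relative entropy vanishes exactly for equal states; it assumes that $\sigma$ and $\tau$ both lie in $(\rho + U^\perp) \cap \mathcal{E}$, so that all terms are finite.

```latex
\begin{align*}
S(\rho,\sigma) + S(\sigma,\tau) &= S(\rho,\tau)
  && \text{by (9), since } \rho - \sigma \perp U, \\
S(\rho,\tau) + S(\tau,\sigma) &= S(\rho,\sigma)
  && \text{by (9) with the roles of } \sigma, \tau \text{ exchanged, since } \rho - \tau \perp U.
\end{align*}
% Add both lines and cancel the finite terms S(\rho,\sigma) + S(\rho,\tau):
\[
  S(\sigma,\tau) + S(\tau,\sigma) = 0 .
\]
% Both summands are non-negative, hence S(\sigma,\tau) = 0 and \sigma = \tau.
```

The same argument with $\sigma = \pi_{\mathcal{E}}(\rho)$ and arbitrary $\tau \in \mathcal{E}$ gives $S(\rho, \tau) \geq S(\rho, \pi_{\mathcal{E}}(\rho))$, which is the content of (11).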

^3 On first reading this section $\mathcal{A} :=$ Mat(n, C) and
$\exp_{\mathcal{A}}(a) := \sum_{i=0}^{\infty} a^i/i!$ for $a \in \mathcal{A}$ can be
assumed. If $\mathcal{A}$ is a C*-subalgebra of Mat(n, C) we have
$\exp_{\mathcal{A}}(a) = \mathbb{1} + \sum_{i=1}^{\infty} a^i/i!$ where $\mathbb{1}$ can differ
from the identity $\mathbb{1}_n$ of Mat(n, C), see Definition 1.1.1.

1.4 New results on the I- and rI-topology in a matrix algebra

We discuss similarities of the I-/rI-topology with a metric topology. Other interesting
properties of the I-/rI-topology, e.g. its relation to convex geometry, are postponed
to §3.2. Since the two topologies have a lot in common, we use a prefix
variable $\omega \in \{\mathrm{I}, \mathrm{rI}\}$ for

$\omega$-closure, $\omega$-topology, etc.

Unless otherwise specified we will use the norm topology.

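The infimum of the relative entropy of a state from a set of states is important
in quantum information theory [CM03, BS05]. We choose this as our starting point.

Definition 1.6. Let $\rho, \sigma \in \mathcal{S}$. We use the short-hand notations

$$ S^{\mathrm{I}}(\rho, \sigma) := S(\sigma, \rho) \quad\text{and}\quad S^{\mathrm{rI}}(\rho, \sigma) := S(\rho, \sigma). $$

The infimum over $X \subset \mathcal{S}$ is denoted by

$$ S^{\omega}(\rho, X) := \inf_{\tau \in X} S^{\omega}(\rho, \tau). $$

The function

$$ d_X(\rho) := S^{\mathrm{rI}}(\rho, X) = \inf_{\tau \in X} S(\rho, \tau) \tag{12} $$

is called entropy distance in [KW11].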

The conditions (13) and (14) imply a large part of our topological results. The
Pinsker-Csiszár inequality, see e.g. §3.4 in [Pe08], states that for
$\rho, \sigma \in \mathcal{S}$ we have

$$ 2 S(\rho, \sigma) \geq \|\rho - \sigma\|_1^2. \tag{13} $$

Here the trace norm from Definition 1.1.2 is used. We prove in Proposition 3.18 for
all $\rho, \sigma \in \mathcal{S}$ and $(\sigma_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ the
continuity result of

$$ \lim_{i\to\infty} S^{\omega}(\sigma, \sigma_i) = 0 \implies \lim_{i\to\infty} S^{\omega}(\rho, \sigma_i) = S^{\omega}(\rho, \sigma). \tag{14} $$

This statement means that the relative entropy is continuous in the first argument
for the I-topology and in the second argument for the rI-topology (see Definition 1.9
and Remark 2.6.2).
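Inequality (13) is easy to probe in the commutative special case, where states are probability vectors (Remark 1.2.2), the trace norm is the $\ell^1$-distance and the relative entropy reduces to (1). The sketch below is illustrative only: the helper names and the three test pairs are assumptions, and passing on a few examples is of course no proof.

```python
import math

def rel_entropy_diag(p, q):
    """S(p, q) for commuting (diagonal) states given as probability vectors."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def trace_norm_diag(p, q):
    """Trace norm ||p - q||_1 of the difference of two diagonal states."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical test pairs (first entry plays the role of rho, second of sigma).
pairs = [
    ((0.5, 0.5), (0.9, 0.1)),
    ((0.2, 0.3, 0.5), (0.4, 0.4, 0.2)),
    ((1.0, 0.0), (0.5, 0.5)),
]

# Pinsker-Csiszar gap 2 S(rho, sigma) - ||rho - sigma||_1^2, expected >= 0.
gaps = [2 * rel_entropy_diag(p, q) - trace_norm_diag(p, q) ** 2 for p, q in pairs]
```

All entries of `gaps` are non-negative. In particular a sequence with relative entropy tending to zero also converges in trace norm, which is the mechanism behind the inclusion of the norm topology in the $\omega$-topology.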

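Definition 1.7. The $\omega$-closure of $X \subset \mathcal{S}$ is
$\operatorname{cl}^{\omega}(X) := \{\rho \in \mathcal{S} \mid S^{\omega}(\rho, X) = 0\}$.

We will see in Theorem 3.20 that taking $\omega$-closures is allowed under the infimum:

Theorem. For all $\rho \in \mathcal{S}$ and $X \subset \mathcal{S}$ we have
$S^{\omega}(\rho, X) = S^{\omega}(\rho, \operatorname{cl}^{\omega}(X))$.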

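The analogue of the $\omega$-closure is well-known in probability theory [CM05], however
it can strictly decrease the infimum as Example 3.11 demonstrates. We turn
to topology. If $\{A(i)\}_{i\in\mathbb{N}}$ is a sequence of statements, then we shall say
that $A(i)$ is true for large $i$ if there is $N \in \mathbb{N}$ such that $A(i)$ holds for
all $i \geq N$.

Definition 1.8. We define a family of subsets of the state space $\mathcal{S}$ by

$$ \mathcal{T}^{\omega} := \{U \subset \mathcal{S} \mid \rho \in U,\ (\rho_i)_{i\in\mathbb{N}} \subset \mathcal{S} \text{ and } \lim_{i\to\infty} S^{\omega}(\rho, \rho_i) = 0 \implies \rho_i \in U \text{ for large } i\}. $$

The open $\omega$-disk about $\rho \in \mathcal{S}$ with radius $\epsilon \in (0, \infty]$ is

$$ V^{\omega}(\rho, \epsilon) := \{\sigma \in \mathcal{S} \mid S^{\omega}(\rho, \sigma) < \epsilon\} \tag{15} $$

and the closed $\omega$-disk about $\rho \in \mathcal{S}$ with radius $\epsilon \in (0, \infty]$ is

$$ W^{\omega}(\rho, \epsilon) := \{\sigma \in \mathcal{S} \mid S^{\omega}(\rho, \sigma) \leq \epsilon\}. \tag{16} $$

We denote the norm topology on $\mathcal{S}$ by $\mathcal{T}^{\|\cdot\|}$ and for
$a \in \mathcal{A}$ and $(a_i)_{i\in\mathbb{N}} \subset \mathcal{A}$ we denote
by $\lim_{i\to\infty} a_i = a$ the convergence in norm.

The family $\mathcal{T}^{\omega}$ is clearly a topology on $\mathcal{S}$ and the inclusion
$\mathcal{T}^{\|\cdot\|} \subset \mathcal{T}^{\omega}$ follows already from the
Pinsker-Csiszár inequality (13). We prove in Theorem 3.21.4 the following.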

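Theorem. The inclusions $\mathcal{T}^{\|\cdot\|} \subset \mathcal{T}^{\mathrm{rI}} \subset
\mathcal{T}^{\mathrm{I}}$ hold and we have for all sequences
$(\rho_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ and $\rho \in \mathcal{S}$

$$ \lim_{i\to\infty} S(\rho_i, \rho) = 0 \implies \lim_{i\to\infty} S(\rho, \rho_i) = 0 \implies \lim_{i\to\infty} \rho_i = \rho. $$

Definition 1.9. We call $\mathcal{T}^{\omega}$ the $\omega$-topology on $\mathcal{S}$.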

We recall topological concepts used in the following discussion. 

Definition 1.10. Let $(X, \mathcal{T})$ be a topological space. An open set
$U \in \mathcal{T}$ is a neighborhood of $x \in X$ if $x \in U$ and $\mathcal{T}$ is a
Hausdorff topology if distinct points of $X$ have disjoint neighborhoods. A family
$\mathcal{B} \subset \mathcal{T}$ is a base for $(X, \mathcal{T})$ if any non-empty
open subset of $X$ can be represented as the union of a subfamily of $\mathcal{B}$. A family
of neighborhoods $\mathcal{B}(x)$ of $x$ is called a base for $(X, \mathcal{T})$ at $x$ if
for any neighborhood $V$ of $x$ there exists $U \in \mathcal{B}(x)$ such that $U \subset V$.
The topological space $(X, \mathcal{T})$ is first-countable if at every point $x \in X$ there
exists a countable base and $(X, \mathcal{T})$ is second-countable if $(X, \mathcal{T})$ has a
countable base.

The above inclusion $\mathcal{T}^{\mathrm{rI}} \subset \mathcal{T}^{\mathrm{I}}$ will be proved
using the open $\omega$-disks $\{V^{\omega}(\rho, \epsilon) \mid \epsilon \in (0, \infty]\}$
which are a base of $\mathcal{T}^{\omega}$ at $\rho \in \mathcal{S}$ (this follows from (14)
and from general results on convergences). The fact that these $\omega$-disks are a base
implies also the equivalence b) below. Equivalence a) is proved in Theorem 3.21.1 (it is
equivalent to $c(\mathcal{T}^{\omega}) = c^{\omega}$).

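Theorem. For $\rho \in \mathcal{S}$ and sequences
$(\rho_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ we have

$$ \lim_{i\to\infty} S^{\omega}(\rho, \rho_i) = 0 \;\overset{\text{a)}}{\iff}\; \forall U \in \mathcal{T}^{\omega} \text{ with } \rho \in U \text{ we have } \rho_i \in U \text{ for large } i $$
$$ \overset{\text{b)}}{\iff}\; \forall \epsilon \in (0, \infty] \text{ we have } \rho_i \in V^{\omega}(\rho, \epsilon) \text{ for large } i. $$

Remark 1.11. Csiszár's work cited in §1.1 gives us a first idea about infinite
dimensional algebras. In our context of algebraic formalism, this corresponds to the von
Neumann algebra of bounded sequences
$\ell^{\infty} := \{x = (x_i)_{i\in\mathbb{N}} \in \mathbb{C}^{\mathbb{N}} \mid \sup_{i\in\mathbb{N}} |x_i| < \infty\}$
acting by multiplication on the Hilbert space
$\ell^2 := \{x \in \ell^{\infty} \mid \sum_{i\in\mathbb{N}} |x_i|^2 < \infty\}$ of
square summable sequences. The space
$\ell^1 := \{x \in \ell^{\infty} \mid \sum_{i\in\mathbb{N}} |x_i| < \infty\}$ of absolutely
summable sequences contains the probability simplex

$$ \mathbb{P}(\mathbb{N}) = \{p = (p_i)_{i\in\mathbb{N}} \in [0,1]^{\mathbb{N}} \mid \textstyle\sum_{i\in\mathbb{N}} p_i = 1\} $$

defined in (7). The trace is the linear functional
$\operatorname{tr} : \ell^1 \to \mathbb{C}$, $x \mapsto \sum_{i\in\mathbb{N}} x_i$. We
have the equality of^4

$$ \mathbb{P}(\mathbb{N}) = \mathcal{S}_{\ell^{\infty}} = \{x \in \ell^1 \mid \operatorname{tr}(x) = 1,\ x \geq 0\}. $$

The information neighborhoods (3) do not define a topology on
$\mathbb{P}(\mathbb{N}) = \mathcal{S}_{\ell^{\infty}}$, hence the equivalence b) in the above
Theorem is wrong for $\mathcal{A} = \ell^{\infty}$.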

1.5 New results on the rI-closure of an exponential family

We use the rI-topology on the state space $\mathcal{S}_{\mathcal{A}}$ of a C*-subalgebra
$\mathcal{A}$ of Mat(n, C) to maximize the von Neumann entropy under linear constraints,
including previously unknown solutions. This result follows from extensions of the
Pythagorean theorem and the projection theorem in information geometry, which we also
present in this section. The complete projection theorem enables us to study local
maximizers and non-commutative features of the entropy distance in §4.3 and §4.4.

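Definition 1.12. The von Neumann entropy of $\rho \in \mathcal{S}$ is

$$ S(\rho) := -\operatorname{tr} \rho \log(\rho), $$

the logarithm is defined by functional calculus in Remark 2.4.4. The free energy of
$\theta \in \mathcal{A}_{\mathrm{sa}}$ is

$$ F(\theta) = F_{\mathcal{A}}(\theta) := \log \operatorname{tr} \exp_{\mathcal{A}}(\theta), $$

the exponential function $\exp_{\mathcal{A}}$ and
$R_{\mathcal{A}}(\theta) = \exp_{\mathcal{A}}(\theta)/\operatorname{tr} \exp_{\mathcal{A}}(\theta)$
are introduced in Definition 1.4.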

^4 The probability simplex $\mathbb{P}(\mathbb{N}) = \mathcal{S}_{\ell^{\infty}}$ corresponds to
the normal states on $\ell^{\infty}$ (see e.g. Theorem 2.4.21 in [Br87]) and the dual space of
continuous linear maps $\ell^{\infty} \to \mathbb{C}$ is strictly larger than
$\mathcal{S}_{\ell^{\infty}}$. It can be represented by bounded additive measures which are
not necessarily $\sigma$-additive (see e.g. p. 89 in [We00] and p. 296 in [DS58]).

Maximization of the von Neumann entropy under linear constraints is a fundamental
problem in quantum statistical mechanics, see e.g. [IO97, Ru99, Pe08, AN00].
Given self-adjoint matrices $u_1, \ldots, u_k \in \mathcal{A}_{\mathrm{sa}}$ it is well-known
that for suitably chosen $\xi = (\xi_1, \ldots, \xi_k) \in \mathbb{R}^k$ there exist inverse
temperatures $\beta_1, \ldots, \beta_k \in \mathbb{R}$, such that

$$ \rho(\xi) = R_{\mathcal{A}}\bigl(-\textstyle\sum_{i=1}^k \beta_i u_i\bigr) \tag{17} $$

uniquely maximizes the von Neumann entropy among all states
$\rho \in \mathcal{S}_{\mathcal{A}}$ with mean values $\langle u_i, \rho \rangle = \xi_i$ for
$i = 1, \ldots, k$. The inverse temperatures $\beta_1, \ldots, \beta_k$ can be
computed from the conditions, $j = 1, \ldots, k$

$$ \frac{\partial}{\partial \beta_j} F_{\mathcal{A}}\bigl(-\textstyle\sum_{i=1}^k \beta_i u_i\bigr) = -\xi_j $$

and the von Neumann entropy is

$$ S(\rho(\xi)) = F_{\mathcal{A}}\bigl(-\textstyle\sum_{i=1}^k \beta_i u_i\bigr) + \textstyle\sum_{i=1}^k \beta_i \xi_i. $$

All invertible maximizers have the form (17) and they form the exponential family
$\mathcal{E} := R_{\mathcal{A}}(U)$ of Gibbs ensembles for the vector space
$U := \operatorname{span}_{\mathbb{R}}(u_1, \ldots, u_k)$. But
there exist maximizers $\rho \in \mathcal{S}$ which are not invertible.
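For a single observable in the commutative algebra $\mathbb{C}^3$ the invertible case of this recipe can be carried out in a few lines. The sketch below is a toy illustration under assumptions made here: the energies `u`, the target mean `xi`, the bisection bounds and all function names are hypothetical. It solves $\langle u, \rho \rangle = \xi$ for the inverse temperature $\beta$ by bisection (the mean is decreasing in $\beta$) and then compares the entropy of the Gibbs state with the formula $F(-\beta u) + \beta \xi$.

```python
import math

def free_energy(u, beta):
    """F(-beta*u) = log sum_i exp(-beta*u_i), evaluated in a stable way."""
    a = [-beta * x for x in u]
    m = max(a)
    return m + math.log(sum(math.exp(t - m) for t in a))

def gibbs(u, beta):
    """Gibbs probability vector with weights exp(-beta*u_i)."""
    f = free_energy(u, beta)
    return [math.exp(-beta * x - f) for x in u]

def mean(u, p):
    return sum(x * pi for x, pi in zip(u, p))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def solve_beta(u, xi, lo=-50.0, hi=50.0, iters=200):
    """Bisection for mean(u, gibbs(u, beta)) = xi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(u, gibbs(u, mid)) > xi:
            lo = mid   # mean still too large: beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = [0.0, 1.0, 2.0]   # hypothetical observable (diagonal entries)
xi = 0.6              # hypothetical mean value in the interior of [0, 2]
beta = solve_beta(u, xi)
p = gibbs(u, beta)

entropy_formula = free_energy(u, beta) + beta * xi
competitor = [0.5, 0.4, 0.1]   # another state with the same mean 0.6
```

The Gibbs vector `p` reproduces the mean `xi`, its entropy agrees with `entropy_formula`, and it exceeds the entropy of `competitor`. Mean values on the boundary of the convex support (here `xi = 0` or `xi = 2`) are not reached by any finite `beta`; these are the non-invertible maximizers that the rI-closure accounts for.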

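We show in Theorem 4.26 that the complete set of solutions is the rI-closure
$\operatorname{cl}^{\mathrm{rI}}(\mathcal{E}) = \{\rho \in \mathcal{S}_{\mathcal{A}} \mid
\inf_{\sigma \in \mathcal{E}} S(\rho, \sigma) = 0\}$. The explicit solutions depend on a
lattice $\mathcal{P}^U$ of orthogonal projections, which in principle can be computed by
spectral analysis. We use the mean value map
$m_{u_1,\ldots,u_k} : \rho \mapsto (\langle u_1, \rho \rangle, \ldots, \langle u_k, \rho \rangle)$
and the convex support $\operatorname{cs}_{\mathcal{A}}(u_1, \ldots, u_k) =
\{m_{u_1,\ldots,u_k}(\rho) \mid \rho \in \mathcal{S}_{\mathcal{A}}\} \subset \mathbb{R}^k$.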

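Theorem. For every mean value tuple
$\xi \in \operatorname{cs}_{\mathcal{A}}(u_1, \ldots, u_k)$ there is a unique maximizer
$\rho(\xi)$ of the von Neumann entropy among all states $\rho \in \mathcal{S}_{\mathcal{A}}$
with mean values $m_{u_1,\ldots,u_k}(\rho) = \xi$. There exists a unique projection
$p \in \mathcal{P}^U$ and there exist inverse temperatures
$\beta_1, \ldots, \beta_k \in \mathbb{R}$ such that

$$ \frac{\partial}{\partial \beta_j} F_{p\mathcal{A}p}\bigl(-\textstyle\sum_{i=1}^k \beta_i\, p u_i p\bigr) = -\xi_j, \qquad j = 1, \ldots, k. $$

For each solution $(\beta_1, \ldots, \beta_k)$ we have

$$ \rho(\xi) = R_{p\mathcal{A}p}\bigl(-\textstyle\sum_{i=1}^k \beta_i\, p u_i p\bigr) $$

and $\rho(\xi)$ has the von Neumann entropy

$$ F_{p\mathcal{A}p}\bigl(-\textstyle\sum_{i=1}^k \beta_i\, p u_i p\bigr) + \textstyle\sum_{i=1}^k \beta_i \xi_i. $$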

The map $\xi \mapsto \rho(\xi)$ is real analytic for invertible solutions $\rho(\xi)$ but can
be discontinuous on the whole, if the algebra $\mathcal{A}$ is non-commutative, see
Remark 4.22.3. An example is the Staffelberg family. Its rI-closure is described in
Example 1.13 and drawn in Figure 1. This exponential family is drawn in Figure 4 together
with its mean value set $\mathbb{M}(U) = \pi_U(\mathcal{S}_{\mathcal{A}})$.

The above theorem will be proved by the following theorems which extend the
Pythagorean theorem and the projection theorem in §1.3. Using notation from
Definition 1.4, we consider an exponential family $\mathcal{E}$ and the entropy distance
$d_{\mathcal{E}} : \mathcal{S} \to [0, \infty)$,
$d_{\mathcal{E}}(\rho) = \inf_{\sigma \in \mathcal{E}} S(\rho, \sigma)$.

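Theorem (Complete projection theorem). Let $\rho \in \mathcal{S}$ be an arbitrary state.

1. The rI-closure $\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$ intersects $\rho + U^\perp$ in
a unique state denoted by $\pi_{\mathcal{E}}(\rho)$.

2. The relative entropy $S(\rho, \cdot)$ has a unique local minimum on
$\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$ and the entropy distance (12) equals
$d_{\mathcal{E}}(\rho) = \min_{\sigma \in \operatorname{cl}^{\mathrm{rI}}(\mathcal{E})} S(\rho, \sigma) = S(\rho, \pi_{\mathcal{E}}(\rho))$.

The complete projection theorem is proved in Theorem 4.15. The state
$\pi_{\mathcal{E}}(\rho)$ is called the rI-projection of $\rho$ to $\mathcal{E}$. The theorem
implies for $\rho \in \mathcal{S}$ and $\tau \neq \pi_{\mathcal{E}}(\rho)$ in
the rI-closure $\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$ that

$$ S(\rho, \tau) - d_{\mathcal{E}}(\rho) > 0. $$

This inequality is not obvious a priori since the infimum in $d_{\mathcal{E}}(\rho)$ is over
$\mathcal{E}$ (and not over $\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$). The exact
difference is proved in Theorem 4.25:

Theorem (Complete Pythagorean theorem). If $\rho \in \mathcal{S}$ and
$\sigma \in \operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$, then
$S(\rho, \pi_{\mathcal{E}}(\rho)) + S(\pi_{\mathcal{E}}(\rho), \sigma) = S(\rho, \sigma)$.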

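The main construction towards the complete projection theorem will be an extension
$\operatorname{ext}(\mathcal{E})$ of the exponential family $\mathcal{E}$, defined in terms
of the mean value set. We prove in a C*-subalgebra $\mathcal{A}$ of Mat(n, C) that

$$ \operatorname{cl}^{\mathrm{rI}}(\mathcal{E}) = \operatorname{ext}(\mathcal{E}) \subset \overline{\mathcal{E}} \tag{18} $$

holds with the rI-closure $\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$ and the norm closure
$\overline{\mathcal{E}}$. A strict inclusion is possible:

Example 1.13. The Staffelberg family $\mathcal{E}$ from Example 1.5 satisfies
$\operatorname{cl}^{\mathrm{rI}}(\mathcal{E}) \subsetneq \overline{\mathcal{E}}$.
This exponential family is depicted in Figure 1: The norm closure $\overline{\mathcal{E}}$ is
the union of $\mathcal{E}$ with the circle about $\mathcal{E}$ (bold) and the closed upright
segment (dashed). The rI-closure $\operatorname{cl}^{\mathrm{rI}}(\mathcal{E})$ is strictly
included in $\overline{\mathcal{E}}$; the upright segment is missing except for
its top end (bold point). See [KW11] for this analysis.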

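An analogous extension $\operatorname{ext}(\mathcal{B})$ of exponential families
$\mathcal{B}$ of Borel probability measures on $\mathbb{R}^d$ satisfies

$$ \operatorname{cl}^{\mathrm{rI}}(\mathcal{B}) \subset \overline{\mathcal{B}} \subset \operatorname{ext}(\mathcal{B}), \tag{19} $$

and contrasts (18). See the introduction and Lemma 6 in [CM05] for these statements.
The rI-closure $\operatorname{cl}^{\mathrm{rI}}(\mathcal{B})$ is defined using the
rI-convergence (2), $\overline{\mathcal{B}}$ is the closure in the variation distance and
$\operatorname{ext}(\mathcal{B})$ is defined using the concept of convex core,
which generalizes convex support (6). If $\mathcal{A} \cong \mathbb{C}^N$ is commutative, then

$$ \operatorname{cl}^{\mathrm{rI}}(\mathcal{E}) = \operatorname{ext}(\mathcal{E}) = \overline{\mathcal{E}} \;\cong\; \operatorname{cl}^{\mathrm{rI}}(\mathcal{B}) = \operatorname{ext}(\mathcal{B}) = \overline{\mathcal{B}} $$

holds, the correspondence being Remark 1.2.2. Further conditions for a commutative
algebra are discussed in §4.4.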

2 Analysis on the state space of a matrix algebra 

This section contains preliminary material mainly by citation from the literature 
with the exception of perturbation theoretic proofs in §2.3. In §2.4 we cite an 
algebraic formulation of the convex geometry of the state space and of its projections 
to linear subspaces.

2.1 Projections and functional calculus

We define functional calculus for normal matrices. This section is a bit technical
because we have to work in subalgebras of Mat(n, C) not containing $\mathbb{1}_n$ in order to
treat conditional probability measures and their formal matrix analogs. The partial
ordering on $\mathcal{A}_{\mathrm{sa}}$ and its restriction to the lattice of projections will
be needed in §2.3. It plays a major role in §2.4 and §4.2-§4.4. A general reference on
lattice theory is [Bi73].

Definition 2.1. 1. A map $f : X \to Y$ between two partially ordered sets $(X, \leq)$
and $(Y, \leq)$ is isotone if for all $x, y \in X$ such that $x \leq y$ we have
$f(x) \leq f(y)$. A lattice is a partially ordered set $(\mathcal{L}, \leq)$ where the
infimum $x \wedge y$ and supremum $x \vee y$ of each two elements $x, y \in \mathcal{L}$
exist. A lattice isomorphism is a bijection between two lattices that preserves the lattice
structure. A lattice $\mathcal{L}$ is complete if for an arbitrary subset
$S \subset \mathcal{L}$ the infimum $\bigwedge S$ and the supremum $\bigvee S$ exist.
The least element $\bigwedge \mathcal{L}$ and the greatest element $\bigvee \mathcal{L}$ in a
complete lattice $\mathcal{L}$ are improper elements of $\mathcal{L}$, all other elements of
$\mathcal{L}$ are proper elements. An atom of a complete lattice $\mathcal{L}$ is an element
$x \in \mathcal{L}$, $x \neq \bigwedge \mathcal{L}$, such that $y \leq x$ and $y \neq x$
implies $y = \bigwedge \mathcal{L}$ for all $y \in \mathcal{L}$.

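2. An element $p \in \mathcal{A}$ is a projection if $p^* = p = p^2$. The projection
lattice of the algebra $\mathcal{A}$ is

$$ \mathcal{P} := \{p \in \mathcal{A} \mid p^2 = p^* = p\}. \tag{20} $$

We use the partial ordering $\preceq$ on $\mathcal{A}_{\mathrm{sa}}$ defined by
$a \preceq b$ for $a, b \in \mathcal{A}_{\mathrm{sa}}$ if and only if $b - a$ is positive
semi-definite. We use the partial order on the projection lattice $\mathcal{P}$, which is
the restriction of $\preceq$. For every projection $p \in \mathcal{P}$ the compressed
algebra $p\mathcal{A}p$ is defined.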

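3. Let $a \in \mathcal{A}$ be a normal matrix, i.e. $a^*a = aa^*$. Let $N \in \mathbb{N}$,
let $\{c_i\}_{i=1}^N \subset \mathbb{C}$ be mutually distinct numbers and let
$\{p_i\}_{i=1}^N \subset \mathcal{P}$ be a family of non-zero projections such that for
$i, j = 1, \ldots, N$ we have $p_i p_j = p_i \delta_{ij}$, where $\delta_{ij} = 0$
unless $i = j$ with $\delta_{ii} = 1$. If $\sum_{i=1}^N p_i = \mathbb{1}$ and

$$ a = \textstyle\sum_{i=1}^N c_i p_i, \tag{21} $$

then the sum (21) is called spectral form of $a$ in $\mathcal{A}$, $\{p_i\}_{i=1}^N$ is a
spectral family for $a$ in $\mathcal{A}$ and its members are spectral projections of $a$ in
$\mathcal{A}$. Let us denote the set of eigenvalues of $a$ by
$\operatorname{spec}(a) := \operatorname{spec}_{\mathrm{Mat}(n,\mathbb{C})}(a)$.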

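Remark 2.2. 1. It is a classical result of linear algebra, see e.g. §§79-80 in [Ha87],
that a normal matrix $a \in$ Mat(n, C) has a unique spectral form
$a = \sum_{\lambda \in \operatorname{spec}(a)} \lambda p_\lambda$ in Mat(n, C). Moreover,
there exist polynomials $\{f_\lambda\}_{\lambda \in \operatorname{spec}(a)}$ in one variable
and with complex coefficients, such that $p_\lambda = f_\lambda(a)$ for
$\lambda \in \operatorname{spec}(a)$.

2. Let $\mathcal{A} \subset$ Mat(n, C) be a C*-subalgebra of Mat(n, C) with identity
$\mathbb{1}$ and $a \in \mathcal{A}$ a normal matrix. If
$a = \sum_{\lambda \in \operatorname{spec}(a)} \lambda p_\lambda$ is the spectral form of $a$
in Mat(n, C) then it is easy to show that

$$ a = \textstyle\sum_{\lambda \in \operatorname{spec}_{\mathcal{A}}(a)} \lambda (\mathbb{1} p_\lambda) $$

is the unique spectral form of $a$ in $\mathcal{A}$. Either
$\operatorname{spec}_{\mathcal{A}}(a) = \operatorname{spec}(a)$ or
$\operatorname{spec}_{\mathcal{A}}(a) \subsetneq \operatorname{spec}_{\mathcal{A}}(a) \cup \{0\} = \operatorname{spec}(a)$.
For all non-zero $\lambda \in \operatorname{spec}_{\mathcal{A}}(a)$ we have
$\mathbb{1} p_\lambda = p_\lambda$.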
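The distinction between spectral values in $\mathcal{A}$ and eigenvalues in Mat(n, C) can be made concrete for diagonal matrices. The following NumPy sketch is an assumption-laden illustration: the function name `spectral_form` is hypothetical, the matrix is taken to be Hermitian with $a = \mathbb{1} a \mathbb{1}$, and the algebra is represented only through its identity `one`. It multiplies the spectral projections of $a$ in Mat(n, C) by the identity of the subalgebra and drops those that vanish.

```python
import numpy as np

def spectral_form(a, one, tol=1e-10):
    """Spectral form of a Hermitian matrix `a` in an algebra with identity `one`.

    Returns a dict {spectral value: spectral projection one @ p_lambda}.
    Projections that vanish after multiplication with `one` are dropped,
    so the keys are the spectral values of `a` in the subalgebra.
    """
    w, v = np.linalg.eigh(a)
    form = {}
    for lam in np.unique(np.round(w, 10)):
        cols = v[:, np.abs(w - lam) < tol]
        p = one @ (cols @ cols.conj().T)
        if np.linalg.norm(p) > tol:
            form[float(lam)] = p
    return form

one_n = np.eye(3)                    # identity of Mat(3, C)
one_A = np.diag([1.0, 1.0, 0.0])     # identity of a subalgebra of type Mat(2, C) + 0

a = np.diag([2.0, 3.0, 0.0])
in_full = spectral_form(a, one_n)    # eigenvalues 0, 2, 3
in_sub = spectral_form(a, one_A)     # spectral values 2, 3 only

b = np.diag([2.0, 0.0, 0.0])
b_sub = spectral_form(b, one_A)      # 0 remains a spectral value of b in the subalgebra
```

For `a` the value 0 is an eigenvalue in Mat(3, C) but not a spectral value in the subalgebra, while for `b` it is both. In either case summing `lam * p` over the returned dictionary recovers the matrix.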

Some special projections and functional calculus will be needed. 

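Definition 2.3. 1. If $a \in \mathcal{A}$ is a normal matrix then we denote the spectral
projections of $a$ by $p_\lambda(a) = p_\lambda^{\mathcal{A}}(a)$ for
$\lambda \in \operatorname{spec}_{\mathcal{A}}(a)$. The support projection of $a$, also
called support of $a$, is
$s(a) := \sum_{\lambda \in \operatorname{spec}_{\mathcal{A}}(a) \setminus \{0\}} p_\lambda(a)$.
The kernel projection of $a$ in $\mathcal{A}$ is $k_{\mathcal{A}}(a) := \mathbb{1} - s(a)$.

2. If $a$ is self-adjoint, then the maximum of $\operatorname{spec}_{\mathcal{A}}(a)$ is
denoted by $\lambda^+(a) = \lambda^+_{\mathcal{A}}(a)$ and the corresponding spectral
projection in $\mathcal{A}$ is denoted by $p^+(a) = p^+_{\mathcal{A}}(a)$ and
is called the maximal projection of $a$ in $\mathcal{A}$.

3. If a complex valued function $f$ is defined on the spectrum of a normal matrix
$a \in \mathcal{A}$, then
$f(a) = f_{\mathcal{A}}(a) := \sum_{\lambda \in \operatorname{spec}_{\mathcal{A}}(a)} f(\lambda) p_\lambda(a)$
is said to be defined by functional calculus in $\mathcal{A}$. If $p \in \mathcal{P}$ we
abbreviate functional calculus in $p\mathcal{A}p$ by
$f^{[p]}(a) = f^{[p]}_{\mathcal{A}}(a) := f_{p\mathcal{A}p}(a)$ provided that the complex
valued function $f$ is defined on the spectrum $\operatorname{spec}_{p\mathcal{A}p}(a)$ of a
normal matrix $a \in p\mathcal{A}p$.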

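Remark 2.4. 1. If $a \in \mathcal{A}$ is a normal matrix and $f : \mathbb{C} \to \mathbb{C}$ is defined on $\mathrm{spec}_{\mathcal{A}}(a)$
and on $\mathrm{spec}_{\mathrm{Mat}(n,\mathbb{C})}(a)$, then we have $f_{\mathcal{A}}(a) = \mathbb{1} f_{\mathrm{Mat}(n,\mathbb{C})}(a)$. For example, we get
for $a, u \in \mathcal{A}$ (the integral is defined in components of matrix entries)

$$ \tfrac{\partial}{\partial t}\big|_{t=0} \exp_{\mathcal{A}}(a + tu) = \int_0^1 \exp_{\mathcal{A}}((1-y)a)\, u \exp_{\mathcal{A}}(ya)\, dy , \tag{22} $$

by multiplication with $\mathbb{1} \in \mathcal{A}$ from the analogous equation in $\mathrm{Mat}(n,\mathbb{C})$. The
latter can be proved by polynomial expansion [Li73].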

2. By Remark 2.2.2, the support projection $s(a)$ does not depend on the algebra
$\mathcal{A} \subset \mathrm{Mat}(n,\mathbb{C})$ that contains a normal matrix $a$. But $k_{\mathcal{A}}(a)$ and $p^+_{\mathcal{A}}(a)$ (if $a$ is
self-adjoint) do depend on $\mathcal{A}$.

3. The term $\log^{[(1,0)]}(1,0) = (0,0)$ is an example of functional calculus in the
compressed algebra $\mathbb{C} \oplus \{0\}$ of $\mathbb{C}^2$ while $\log(1,0)$ is undefined.

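4. The relative entropy introduced in Definition 1.3 is understood as the function
such that for $\rho, \sigma \in \mathcal{S}$ we have $S(\rho,\sigma) := \infty$ unless $s(\rho) \preceq s(\sigma)$ where

$$ S(\rho,\sigma) := \mathrm{tr}\, \rho \left( \log^{[s(\rho)]}(\rho) - \log^{[s(\sigma)]}(\sigma) \right) . $$

Similarly, the von Neumann entropy of $\rho \in \mathcal{S}$ introduced in Definition 1.12 is
$S(\rho) := -\mathrm{tr}\, \rho \log^{[s(\rho)]}(\rho)$. By part 2 these definitions restrict from $\mathrm{Mat}(n,\mathbb{C})$ to
any C*-subalgebra of $\mathrm{Mat}(n,\mathbb{C})$.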

5. The projection lattice $\mathcal{P}$ with the partial ordering $\preceq$ is a complete lattice. For
this and the following two statements see e.g. Remark 2.6 in [We11] or [AS01].
For a self-adjoint matrix $a \in \mathcal{A}_{\mathrm{sa}}$ and a projection $p \in \mathcal{P}$ we have

$$ a = pap \iff pa = a \iff s(a) \preceq p . \tag{23} $$

Hence the ordering for projections $p, q \in \mathcal{P}$ simplifies to $p \preceq q \iff pq = p$.
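
To make the support convention of Remark 2.4.4 concrete, the following minimal sketch evaluates the relative entropy and the von Neumann entropy in the commutative algebra $\mathbb{C}^n$, where states are probability vectors and functional calculus in a compressed algebra amounts to taking logarithms only on the support. The function names and the restriction to the commutative case are our assumptions for illustration; natural logarithms are used.

```python
import math

def relative_entropy(rho, sigma):
    """S(rho, sigma) for probability vectors, i.e. states on C^n (illustrative helper).

    Returns infinity unless the support of rho is contained in the support of sigma.
    """
    if any(r > 0 and s == 0 for r, s in zip(rho, sigma)):
        return math.inf
    # The logarithms are evaluated only on the support of rho.
    return sum(r * (math.log(r) - math.log(s)) for r, s in zip(rho, sigma) if r > 0)

def von_neumann_entropy(rho):
    """S(rho) = -tr rho log(rho), with the logarithm taken on the support of rho."""
    return -sum(r * math.log(r) for r in rho if r > 0)
```

For instance, `relative_entropy((1.0, 0.0), (0.5, 0.5))` evaluates to $\log 2$, while exchanging the two arguments gives infinity because the support condition fails.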
2.2 Continuity and convexity of the relative entropy

We recall convexity and continuity properties (in the norm topology) of the relative 
entropy, introduced in Definition 1.3. 

Definition 2.5. 1. A function $f : X \to (-\infty, \infty]$ defined on a convex subset $X$
of a finite-dimensional Euclidean vector space $\mathbb{E}$ is convex if for $x, y \in X$ and
$\lambda \in [0,1]$ we have

$$ f((1-\lambda)x + \lambda y) \le (1-\lambda) f(x) + \lambda f(y) . $$

A finite function $f : X \to \mathbb{R}$ is strictly convex if for $x, y \in X$, $x \ne y$ and $\lambda \in (0,1)$
we have

$$ f((1-\lambda)x + \lambda y) < (1-\lambda) f(x) + \lambda f(y) . $$

If $f$ is (strictly) convex, we say that $-f$ is (strictly) concave.

2. If $(X, d)$ is a metric space and $f : X \to (-\infty, \infty]$ then $f$ is lower semi-continuous
if for all $x \in X$ and every sequence $(x_i)_{i\in\mathbb{N}} \subset X$ converging to $x$ we have

$$ \liminf_{i\to\infty} f(x_i) \ge f(x) . $$

3. If $(X, d)$ is a metric space and $f : X \to (-\infty, \infty]$ then $f$ is lower continuous if
for all $x \in X$ we have

$$ \lim_{\epsilon \searrow 0} \inf\{ f(y) \mid d(x,y) < \epsilon \} = f(x) . $$

Remark 2.6. 1. The relative entropy $S : \mathcal{S} \times \mathcal{S} \to [0, \infty]$ is a (norm) lower
semi-continuous and convex function, see e.g. §III.B in [We78]. Under these
assumptions the Corollary of Lemma 17.4 in [Ce82] proves that the relative entropy is
(norm) lower continuous on $\mathcal{S} \times \mathcal{S}$.

2. The relative entropy is discontinuous in the norm topology in its first argument
already for the algebra $\mathcal{A} = \mathbb{C}^2$ of a bit and in the second argument for the algebra
$\mathcal{A} = \mathrm{Mat}(2,\mathbb{C})$ of a qubit, see Example 3.19. However, in Proposition 3.18 we show
that the relative entropy is continuous in the I-topology in its first argument and
continuous in the rI-topology in its second argument.
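
The discontinuity in the first argument can already be seen by an elementary computation. The following worked example is our own illustration for the algebra $\mathbb{C}^2$ (it need not coincide with Example 3.19); states are written as probability vectors and the support convention of Remark 2.4.4 is used.

```latex
\rho_i := \left(1 - \tfrac{1}{i},\, \tfrac{1}{i}\right), \qquad
\sigma := (1, 0), \qquad
\lim_{i\to\infty} \|\rho_i - \sigma\| = 0 .
% The support of \rho_i is not dominated by the support of \sigma, so
S(\rho_i, \sigma) = \infty \quad \text{for all } i \in \mathbb{N},
\qquad \text{while} \qquad S(\sigma, \sigma) = 0 .
% Lower semi-continuity is not violated:
\liminf_{i\to\infty} S(\rho_i, \sigma) = \infty \;\ge\; 0 = S(\sigma, \sigma) .
```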

The lower continuity of the relative entropy will be used in Theorem 3.21.5 to 
study the rl-topology. 

2.3 Two perturbative statements 

We provide arguments from the perturbation theory of Mat(n, C), which will be 
used to characterize the rl-convergence and to study exponential families. 

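We denote the set of eigenvalues of $a \in \mathrm{Mat}(n,\mathbb{C})$ by $\mathrm{spec}(a) = \mathrm{spec}_{\mathrm{Mat}(n,\mathbb{C})}(a)$
and we shall write $\zeta$ in place of $\zeta \mathbb{1}_n$ for scalars $\zeta \in \mathbb{C}$.

Definition 2.7. 1. The resolvent set of a matrix $a \in \mathrm{Mat}(n,\mathbb{C})$ is the complement
of the spectrum $\mathrm{res}(a) := \mathbb{C} \setminus \mathrm{spec}(a)$.

2. The resolvent of $a \in \mathrm{Mat}(n,\mathbb{C})$ is defined for $\zeta \in \mathrm{res}(a)$ by $(a - \zeta)^{-1}$.

3. The second resolvent equation for $a, b \in \mathrm{Mat}(n,\mathbb{C})$ and $\zeta \in \mathrm{res}(a) \cap \mathrm{res}(b)$ is

$$ (a - \zeta)^{-1} - (b - \zeta)^{-1} = (a - \zeta)^{-1} (b - a) (b - \zeta)^{-1} . \tag{24} $$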

Remark 2.8. 1. If $a, b \in \mathrm{Mat}(n,\mathbb{C})$ are self-adjoint matrices, let $\lambda^\downarrow_1(a), \ldots, \lambda^\downarrow_n(a)$
denote the eigenvalues of $a$ arranged in decreasing order and counting multiplicities.
Weyl's perturbation theorem, proved e.g. in §III.2 in [Bh97], states that

$$ \max_{i=1,\ldots,n} |\lambda^\downarrow_i(a) - \lambda^\downarrow_i(b)| \le \|a - b\| . \tag{25} $$

Here the spectral norm from Definition 1.1.2 is used.

2. According to Problem 5.7 on page 40 in [Ka95], if $\zeta$ belongs to $\mathrm{res}(a)$ for a normal
matrix $a \in \mathrm{Mat}(n,\mathbb{C})$, then the resolvent of $a$ is bounded by

$$ \|(a - \zeta)^{-1}\| \le \mathrm{dist}(\zeta, \mathrm{spec}(a))^{-1} \tag{26} $$

where $\mathrm{dist}(z, M) := \inf\{ |z - m| \mid m \in M \}$ for $z \in \mathbb{C}$ and $M \subset \mathbb{C}$.

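3. Given a normal matrix $a \in \mathrm{Mat}(n,\mathbb{C})$, let $\Gamma \subset \mathrm{res}(a)$ be a positively oriented
circular curve of radius $r > 0$. It is well-known, see e.g. Chapter 2 §1.4 in [Ka95],
that

$$ P_\Gamma(a) := -\frac{1}{2\pi i} \oint_\Gamma (a - \zeta)^{-1}\, d\zeta \tag{27} $$

is the sum of all spectral projections $p^\lambda(a)$ of $a$ in $\mathrm{Mat}(n,\mathbb{C})$, such that $\lambda$ lies
inside $\Gamma$.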

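4. Let $a, b \in \mathrm{Mat}(n,\mathbb{C})$ be self-adjoint matrices and let $\Gamma_\lambda$ be disjoint circular
curves of radius $r > 0$ centered at $\lambda \in \mathrm{spec}(b)$. If $\|b - a\| < r$, then by
Weyl's perturbation theorem (25) every eigenvalue of $a$ lies in exactly one of
the circles $\{\Gamma_\lambda\}_{\lambda\in\mathrm{spec}(b)}$. The projections $Q^\lambda(a) := P_{\Gamma_\lambda}(a)$ from (27) are defined and
$\mathbb{1}_n = \sum_\lambda Q^\lambda(a)$ holds (with summation over the eigenvalues $\lambda \in \mathrm{spec}(b)$ of $b$).
The second resolvent equation (24) and the inequality (26) imply for $\lambda \in \mathrm{spec}(b)$

$$ \|Q^\lambda(a) - p^\lambda(b)\| \le \frac{1}{2\pi} \oint_{\Gamma_\lambda} \|(b - \zeta)^{-1} (b - a) (a - \zeta)^{-1}\|\, |d\zeta| \le \frac{\|b - a\|}{r - \|b - a\|} . \tag{28} $$

Hence for fixed $b$, if $\|b - a\| \to 0$ then $Q^\lambda(a)$ converges in spectral norm to $p^\lambda(b)$.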
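
Weyl's perturbation theorem (25) is easy to probe numerically. The sketch below is a minimal illustration restricted to real symmetric 2x2 matrices, where eigenvalues and the spectral norm have closed forms; the helper names are ours and the restriction to this special case is an assumption made to keep the code free of external libraries.

```python
import math

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]], in decreasing order."""
    (p, q), (_, r) = m
    mean = (p + r) / 2
    rad = math.hypot((p - r) / 2, q)
    return (mean + rad, mean - rad)

def spectral_norm_sym2(m):
    """Spectral norm of a real symmetric 2x2 matrix: the largest absolute eigenvalue."""
    return max(abs(x) for x in eigvals_sym2(m))

def weyl_sides(a, b):
    """Return (max_i |lambda_i(a) - lambda_i(b)|, ||a - b||), the two sides of (25)."""
    diff = [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]
    lhs = max(abs(x - y) for x, y in zip(eigvals_sym2(a), eigvals_sym2(b)))
    return lhs, spectral_norm_sym2(diff)
```

Calling `weyl_sides(a, b)` on any pair of symmetric 2x2 matrices returns a pair whose first entry does not exceed the second, up to rounding.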

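The next lemma will be used in Proposition 3.18 to characterize the rI-convergence.

Lemma 2.9. Let $\rho, \sigma \in \mathcal{S}$ and $(\tau_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ such that $s(\rho) \preceq s(\sigma) \preceq s(\tau_i)$ holds for
all $i \in \mathbb{N}$. Then $\lim_{i\to\infty} S(\sigma, \tau_i) = 0$ implies $\lim_{i\to\infty} S(\rho, \tau_i) = S(\rho, \sigma)$.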

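Proof: By the Pinsker-Csiszár inequality (13) the sequence $(\tau_i)_{i\in\mathbb{N}}$ converges to
$\sigma$ in norm. We view $\tau_i$ as a perturbation of $\sigma$ and take a sufficiently small circle
$\Gamma$ of radius $r > 0$ about $0 \in \mathbb{C}$. Then, for large $i \in \mathbb{N}$ the projection $P_\Gamma(\tau_i)$ in
(27) is defined and satisfies $k(\tau_i) \preceq P_\Gamma(\tau_i)$ where $k(\tau_i) = k_{\mathrm{Mat}(n,\mathbb{C})}(\tau_i)$ is the kernel
projection. Then two projections $p_i, q_i \in \mathcal{A}$ are defined by $p_i := P_\Gamma(\tau_i) - k(\tau_i)$ and
$q_i := \mathbb{1}_n - P_\Gamma(\tau_i)$, they satisfy $q_i + p_i = s(\tau_i)$. We think of $p_i$ as the negligible
contribution to $s(\tau_i)$.

According to Definition 2.3.3 we split the functional calculus into two compressed
algebras $p_i \mathcal{A} p_i$ and $q_i \mathcal{A} q_i$,

$$ S(\sigma, \tau_i) = -S(\sigma) - \mathrm{tr}\, \sigma \log^{[p_i]}(p_i \tau_i) - \mathrm{tr}\, \sigma \log^{[q_i]}(q_i \tau_i) . $$

We have $\tau_i \to \sigma$ for $i \to \infty$, by (28) we have $q_i \to s(\sigma)$ and the spectral values of $q_i \tau_i q_i$
in $q_i \mathcal{A} q_i$ are strictly larger than $r > 0$, hence the term $\log^{[q_i]}(q_i \tau_i) \to \log^{[s(\sigma)]}(\sigma)$
converges. Using the assumption $S(\sigma, \tau_i) \to 0$ gives $\lim_{i\to\infty} \mathrm{tr}\, \sigma \log^{[p_i]}(p_i \tau_i) = 0$.

Now we use a monotonicity argument. It is clear that $\rho / \lambda^+(\rho) \preceq s(\rho)$ holds and
by assumption we have $s(\rho) \preceq s(\sigma)$. If $\lambda > 0$ is the smallest non-zero eigenvalue of
$\sigma$ then $\lambda s(\sigma) \preceq \sigma$. Hence $\sigma - \frac{\lambda}{\lambda^+(\rho)} \rho \succeq 0$. For all large $i \in \mathbb{N}$ we have
$\log^{[p_i]}(p_i \tau_i) \preceq 0$, hence

$$ 0 = \lim_{i\to\infty} \mathrm{tr}\, \sigma \log^{[p_i]}(p_i \tau_i) \le \frac{\lambda}{\lambda^+(\rho)} \lim_{i\to\infty} \mathrm{tr}\, \rho \log^{[p_i]}(p_i \tau_i) \le 0 $$

proves $\lim_{i\to\infty} \mathrm{tr}\, \rho \log^{[p_i]}(p_i \tau_i) = 0$. Now

$$ S(\rho, \tau_i) = -S(\rho) - \mathrm{tr}\, \rho \log^{[p_i]}(p_i \tau_i) - \mathrm{tr}\, \rho \log^{[q_i]}(q_i \tau_i)
   \;\longrightarrow\; -S(\rho) - 0 - \mathrm{tr}\, \rho \log^{[s(\sigma)]}(\sigma) = S(\rho, \sigma) $$

completes the proof. □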

The following statement is used in Proposition 4.3 to set up the mean value 
chart of an exponential family and in Lemma 4.12 to study rl-closures of exponential 
families. Part 1 is used implicitly in Lemma 7 in [Wi63]. 

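Lemma 2.10. 1. Let $(x_j)_{j\in\mathbb{N}} \subset \mathcal{A}_{\mathrm{sa}} \setminus \{0\}$ such that $\lim_{j\to\infty} \|x_j\| = \infty$. We assume
there exist $u, a \in \mathcal{A}_{\mathrm{sa}}$ such that $\lim_{j\to\infty} x_j / \|x_j\| = u$ and $\lim_{j\to\infty} \exp_{\mathcal{A}}(x_j) = a$. Then
$\mathrm{spec}_{\mathcal{A}}(u) \subset (-\infty, 0]$ and $s(a) \preceq k_{\mathcal{A}}(u)$.

2. Let $\theta, u \in \mathcal{A}_{\mathrm{sa}}$ such that $\mathrm{spec}_{\mathcal{A}}(u) \subset (-\infty, 0]$. Then

$$ \lim_{t\to+\infty} \exp_{\mathcal{A}}(\theta + t u) = \exp^{[k_{\mathcal{A}}(u)]}_{\mathcal{A}}\big( k_{\mathcal{A}}(u)\, \theta\, k_{\mathcal{A}}(u) \big) . $$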

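Proof: The strategy in the first part is to consider $y_j := x_j / \|x_j\|$ as perturbations of $u$
and to estimate spectral values of $e^{x_j}$ in suitable compressed subalgebras. We choose
disjoint circular curves $\Gamma_\lambda$ in the complex plane about the eigenvalues $\lambda \in \mathrm{spec}(u)$.
Using Weyl's perturbation theorem (25), the projections in (27)

$$ Q^\lambda(x_j) := P_{\|x_j\| \Gamma_\lambda}(x_j) = P_{\Gamma_\lambda}(y_j) $$

are defined for large $j$. Let $\lambda \in \mathrm{spec}(u)$ and $\lambda \ne 0$. The projection $Q^\lambda(x_j)$ is a
sum of spectral projections of $x_j$ in $\mathrm{Mat}(n,\mathbb{C})$ for non-zero eigenvalues of $x_j$, so
$Q^\lambda(x_j) \in \mathcal{A}$ by Remark 2.2.2. We consider functional calculus in the compressed
algebra $Q^\lambda(x_j) \mathcal{A} Q^\lambda(x_j)$,

$$ h^\lambda(x_j) := \exp^{[Q^\lambda(x_j)]}_{\mathcal{A}}\big( Q^\lambda(x_j) x_j \big) = Q^\lambda(x_j) \exp(x_j) . $$

The spectral values of the self-adjoint matrix $Q^\lambda(x_j) y_j$ in $Q^\lambda(x_j) \mathcal{A} Q^\lambda(x_j)$ converge
for $j \to \infty$ to $\lambda \ne 0$ because there is only one eigenvalue of $u$ in the circle $\Gamma_\lambda$. Since
$x_j = y_j \|x_j\|$ we have for $\lambda < 0$ and for large $j$ the bound $\|h^\lambda(x_j)\| \le e^{\frac{\lambda}{2} \|x_j\|}$. Then

$$ \|x_j\| \to \infty \quad \text{implies} \quad h^\lambda(x_j) \to 0 . \tag{29} $$

If $\lambda > 0$ then the analogous arguments show that the spectral norm $\|h^\lambda(x_j)\| \ge e^{\frac{\lambda}{2} \|x_j\|}$
diverges to $+\infty$.

For $\lambda \ne 0$ the projection $Q^\lambda(x_j)$ converges to $p^\lambda(u)$ by (28). Hence with summation
over $\lambda \in \mathrm{spec}(u) \setminus \{0\}$ we have $s(u) = \lim_{j\to\infty} \sum_{\lambda\ne 0} Q^\lambda(x_j)$. Now the assumed
convergence $\exp_{\mathcal{A}}(x_j) \to a$ gives

$$ s(u)\, a = \lim_{j\to\infty} \sum_{\lambda\ne 0} Q^\lambda(x_j) \exp(x_j) = \lim_{j\to\infty} \sum_{\lambda\ne 0} h^\lambda(x_j) = 0 $$

and $\mathrm{spec}(u) \subset (-\infty, 0]$. Then (23) and the equation

$$ k_{\mathcal{A}}(u)\, a = (\mathbb{1} - s(u))\, a = a $$

show $s(a) \preceq k_{\mathcal{A}}(u)$.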

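We prove convergence and calculate the limit in the second statement. For a small
real parameter $c > 0$ let $x_c := u + c\theta$, then $x_c \to u$ for $c \to 0$. For $\lambda \in \mathrm{spec}(u) \cup \{0\}$ we choose
disjoint circular curves $\Gamma_\lambda$ in the complex plane about each such $\lambda$ and we define

$$ Q^\lambda(x_c) := P_{\Gamma_\lambda}(x_c) . $$

For all $\lambda < 0$ the argument in (29) shows $Q^\lambda(x_c) \exp(\tfrac{1}{c} x_c) \to 0$ for $c \to 0$. Since
$\mathbb{1}_n = \sum_{\lambda\in\mathrm{spec}(u)\cup\{0\}} Q^\lambda(x_c)$ holds for small $c$ we have

$$ \lim_{t\to+\infty} \exp(\theta + t u) = \lim_{c\to 0} \exp(\tfrac{1}{c} x_c) = \lim_{c\to 0} Q^0(x_c) \exp\big( Q^0(x_c) \tfrac{1}{c} x_c \big) . $$

By (28) we have $Q^0(x_c) \to k(u) \in \mathrm{Mat}(n,\mathbb{C})$. The first order expansion is calculated
in Chapter II §1 equation (1.17) in [Ka95]: With $Q := \frac{1}{2\pi i} \oint_{\Gamma_0} (u - \zeta)^{-1} \theta (u - \zeta)^{-1}\, d\zeta$
we have

$$ Q^0(x_c) = k(u) + cQ + o(c) . $$

Here $o(c)$ is the Landau symbol: if $g$ is a positive function and $f$ is any function (here
with values in $\mathrm{Mat}(n,\mathbb{C})$), then $f = o(g)$ means $\lim_{c\to 0} f(c)/g(c) = 0$. We compute

$$ Q^0(x_c) \tfrac{1}{c} x_c = \tfrac{1}{c} Q^0(x_c)\, x_c\, Q^0(x_c) = k(u)\, \theta\, k(u) + o(1) $$

and the continuity of the exponential gives

$$ \lim_{t\to+\infty} \exp(\theta + t u) = \lim_{c\to 0} Q^0(x_c) \exp\big( Q^0(x_c) \tfrac{1}{c} x_c \big) = k(u) \exp\big( k(u)\, \theta\, k(u) \big) . $$

Multiplication of this formula with the identity $\mathbb{1}$ of $\mathcal{A}$ completes the proof. □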
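
The limit in Lemma 2.10.2 can be observed numerically in a small case. The following sketch is our illustration only: it fixes $u = \mathrm{diag}(0, -1)$ in the algebra of real symmetric 2x2 matrices, so that $k(u) = \mathrm{diag}(1, 0)$, and compares $\exp(\theta + t u)$ for large $t$ with $k(u) \exp(k(u) \theta k(u)) = \mathrm{diag}(e^{\theta_{11}}, 0)$. The function names and the choice of $u$ are assumptions, and the matrix exponential is computed from the spectral form.

```python
import math

def exp_sym2(m):
    """Exponential of a real symmetric 2x2 matrix via its spectral form."""
    (p, q), (_, r) = m
    mean = (p + r) / 2
    rad = math.hypot((p - r) / 2, q)
    if rad == 0.0:
        return [[math.exp(mean), 0.0], [0.0, math.exp(mean)]]
    hi, lo = mean + rad, mean - rad
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            proj_hi = (m[i][j] - lo * delta) / (hi - lo)  # spectral projection for hi
            proj_lo = (hi * delta - m[i][j]) / (hi - lo)  # spectral projection for lo
            out[i][j] = math.exp(hi) * proj_hi + math.exp(lo) * proj_lo
    return out

def exp_theta_plus_tu(theta, t):
    """exp(theta + t*u) for the fixed (assumed) choice u = diag(0, -1)."""
    shifted = [[theta[0][0], theta[0][1]], [theta[1][0], theta[1][1] - t]]
    return exp_sym2(shifted)

def predicted_limit(theta):
    """k exp(k theta k) with k = k(u) = diag(1, 0)."""
    return [[math.exp(theta[0][0]), 0.0], [0.0, 0.0]]
```

In this example the deviation from the predicted limit shrinks roughly like $1/t$, which is consistent with the first order expansion used in the proof.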
2.4 Lattices of faces and projections

In this section we settle definitions of convex geometry. For convenience we provide
a selected overview of the algebraic description of the convex geometry of the mean
value set $\mathbb{M}(U) = \pi_U(\mathcal{S})$, defined in (4) as the orthogonal projection of the state
space $\mathcal{S}$ onto a linear subspace $U \subset \mathcal{A}_{\mathrm{sa}}$.

We need two distinct notions of "face" of a convex set, each defining a lattice of
subsets ordered by inclusion. We begin with a general convex set.

Definition 2.11. Let $(\mathbb{E}, \langle\cdot,\cdot\rangle)$ be a finite-dimensional Euclidean vector space.

1. The closed segment between $x, y \in \mathbb{E}$ is $[x,y] := \{(1-\lambda)x + \lambda y \mid \lambda \in [0,1]\}$, the
open segment is $]x,y[ := \{(1-\lambda)x + \lambda y \mid \lambda \in (0,1)\}$. A subset $C \subset \mathbb{E}$ is convex
if $x, y \in C \implies [x,y] \subset C$.

2. Let $C$ be a convex subset of $\mathbb{E}$. A face of $C$ is a convex subset $F$ of $C$, such
that whenever for $x, y \in C$ the open segment $]x,y[$ intersects $F$, then the closed
segment $[x,y]$ is included in $F$. If $x \in C$ and $\{x\}$ is a face, then $x$ is called an
extreme point. The set of faces of $C$ will be denoted by $\mathcal{F}(C)$, called the face
lattice of $C$.

3. The support function of a convex subset $C \subset \mathbb{E}$ is defined by $\mathbb{E} \to \mathbb{R} \cup \{\pm\infty\}$,
$u \mapsto h(C,u) := \sup_{x\in C} \langle u, x\rangle$. For non-zero $u \in \mathbb{E}$ the set

$$ H(C,u) := \{ x \in \mathbb{E} : \langle u, x\rangle = h(C,u) \} $$

is an affine hyperplane unless it is empty, which can happen if $C = \emptyset$ or if $C$ is
unbounded in $u$-direction. If $C \cap H(C,u) \ne \emptyset$, then we call $H(C,u)$ a supporting
hyperplane of $C$. The exposed face of $C$ by $u$ is

$$ F_\perp(C,u) := C \cap H(C,u) $$

and we put $F_\perp(C,0) := C$. The faces $\emptyset$ and $C$ are exposed faces of $C$ by definition.
The set of exposed faces of $C$ will be denoted by $\mathcal{F}_\perp(C)$, called the exposed face
lattice of $C$. A face of $C$ which is not an exposed face is a non-exposed face and
we then say the face $F$ is not exposed. If an extreme point of $C$ defines an exposed
face then it is an exposed point. Otherwise the extreme point is a non-exposed
point.

4. Some topology is needed. Let $X \subset \mathbb{E}$ be an arbitrary subset. The affine hull
of $X$, denoted by $\mathrm{aff}(X)$, is the smallest affine subspace of $\mathbb{E}$ that contains $X$.
The interior of $X$ with respect to the relative topology of $\mathrm{aff}(X)$ is the relative
interior $\mathrm{ri}(X)$ of $X$. The complement $\mathrm{rb}(X) := X \setminus \mathrm{ri}(X)$ is the relative boundary
of $X$. If $C \subset \mathbb{E}$ is a non-empty convex subset then we consider the vector space
$\mathrm{lin}(C) = \{ x - y \mid x, y \in \mathrm{aff}(C) \}$. We define the dimension $\dim(C) := \dim(\mathrm{lin}(C))$
and $\dim(\emptyset) := -1$.

Remark 2.12. 1. As observed e.g. in [KW11, We12], the mean value set $\mathbb{M}(U)$ can
have non-exposed faces even though all faces of $\mathcal{S}$ are exposed. An example is
shown in Figure 2.

Figure 2: This clove shape is the mean value set of the Swallow family. The supporting
hyperplane to the left defines a one-dimensional exposed face. Two non-exposed
points are indicated by small circles. The supporting hyperplane to the right defines
an exposed point.

2. Let $C \subset \mathbb{E}$ be a convex subset. In contrast to Rockafellar or Schneider [Ro72, Sc93]
we always include $\emptyset$ and $C$ in $\mathcal{F}_\perp(C)$ so that this set is a lattice. The inclusion
$\mathcal{F}_\perp(C) \subset \mathcal{F}(C)$ is easy to show and there are various ways to see that $\mathcal{F}_\perp(C)$
and $\mathcal{F}(C)$ are complete lattices ordered by inclusion where the infimum is the
intersection, see e.g. §1.1 in [We12] or §2.1 in [We11]. The convex set $C$ admits
by Theorem 18.2 in [Ro72] a partition into relative interiors of its faces

$$ C = \dot{\bigcup}_{F\in\mathcal{F}(C)} \mathrm{ri}(F) . \tag{30} $$

In particular, every proper face of $C$ is included in the relative boundary of $C$
and its dimension is strictly smaller than the dimension of $C$.

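We recall the algebraic description of the face lattice $\mathcal{F}(\mathcal{S}_{\mathcal{A}}) = \mathcal{F}_\perp(\mathcal{S}_{\mathcal{A}})$ of the
state space $\mathcal{S}_{\mathcal{A}}$.

Definition 2.13. Extreme points of $\mathcal{S}$ are called pure states. For every orthogonal
projection $p \in \mathcal{P}_{\mathcal{A}}$ we set

$$ \mathbb{F}(p) = \mathbb{F}_{\mathcal{A}}(p) := \mathcal{S}_{p\mathcal{A}p} $$

and we denote the face lattice of the state space by $\mathcal{F} = \mathcal{F}_{\mathcal{A}} := \mathcal{F}(\mathcal{S}_{\mathcal{A}})$.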

Proposition 2.14 (Proposition 2.9 in [We11]). The state space $\mathcal{S}$ is a convex body
of dimension $\dim(\mathcal{A}_{\mathrm{sa}}) - 1$, the affine hull is $\mathrm{aff}(\mathcal{S}) = \mathcal{A}_1$, the translation vector
space is $\mathrm{lin}(\mathcal{S}) = \mathcal{A}_0$ and the relative interior consists of all invertible states. The
support function at $a \in \mathcal{A}_{\mathrm{sa}}$ is the maximal spectral value $h(\mathcal{S}, a) = \lambda^+(a)$ of $a$. If
$a \in \mathcal{A}_{\mathrm{sa}}$ is non-zero, then the exposed face of $a$ is the state space $F_\perp(\mathcal{S}, a) = \mathbb{F}(p)$ of
the compressed algebra $p\mathcal{A}p$, where $p = p^+(a)$ is the maximal projection of $a$.

Corollary 2.15 (Corollary 2.10 in [We11]). All faces of the state space $\mathcal{S}$ are exposed.
The mapping $\mathbb{F} : \mathcal{P} \to \mathcal{F}$, $p \mapsto \mathbb{F}(p)$ is an isomorphism of complete lattices.

Remark 2.16. It follows from Corollary 2.15, Proposition 2.14 and (23) that every
face of $\mathcal{S}$ can be written as $\mathbb{F}(p) = \{ \rho \in \mathcal{S} \mid s(\rho) \preceq p \}$ for some $p \in \mathcal{P}$ and the
relative interior is $\mathrm{ri}\,\mathbb{F}(p) = \{ \rho \in \mathcal{S} \mid s(\rho) = p \}$.
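
The identity $h(\mathcal{S}, a) = \lambda^+(a)$ from Proposition 2.14 can be checked by brute force in a small case. The sketch below is an illustration under simplifying assumptions of ours: $a$ is a real symmetric 2x2 matrix, and the supremum of $\mathrm{tr}(\rho a)$ is searched only over real pure states $\rho = |v\rangle\langle v|$ with $v = (\cos t, \sin t)$. This suffices here because a linear functional on a convex body attains its supremum at an extreme point and the top eigenvector of a real symmetric matrix can be chosen real. The function names are hypothetical.

```python
import math

def max_eigenvalue_sym2(a):
    """Largest eigenvalue of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = a
    return (p + r) / 2 + math.hypot((p - r) / 2, q)

def support_function_on_pure_states(a, steps=3600):
    """Grid search for max tr(rho a) over real pure states rho = |v><v|, v = (cos t, sin t)."""
    (p, q), (_, r) = a
    best = -math.inf
    for k in range(steps):
        t = math.pi * k / steps
        c, s = math.cos(t), math.sin(t)
        # tr(|v><v| a) = <v, a v>
        best = max(best, p * c * c + 2 * q * c * s + r * s * s)
    return best
```

The grid value never exceeds the maximal eigenvalue (up to rounding) and approaches it as `steps` grows; the maximizing state approximates a state supported on the maximal projection $p^+(a)$.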

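Let us turn to the mean value set $\mathbb{M}(U) = \pi_U(\mathcal{S})$ defined in (4), where $U \subset \mathcal{A}_{\mathrm{sa}}$
is a linear subspace. A lifting construction connects to the isomorphism $\mathbb{F} : \mathcal{P} \to \mathcal{F}$.
This leads to algebraic descriptions of the two face lattices $\mathcal{F}_\perp(\mathbb{M}(U)) \subset \mathcal{F}(\mathbb{M}(U))$.

Definition 2.17. We define for subsets $C \subset \mathcal{A}_{\mathrm{sa}}$ the (set-valued) lift by

$$ L(C) = L^U_{\mathcal{A}}(C) := \mathcal{S}_{\mathcal{A}} \cap (C + U^\perp) . $$

We define the lifted face lattice

$$ \mathcal{L}^U = \mathcal{L}^U_{\mathcal{A}} := \{ L(F) \mid F \in \mathcal{F}(\mathbb{M}(U)) \} $$

and the lifted exposed face lattice

$$ \mathcal{L}^{U,\perp} = \mathcal{L}^{U,\perp}_{\mathcal{A}} := \{ L(F) \mid F \in \mathcal{F}_\perp(\mathbb{M}(U)) \} . $$

Lemma 2.18 (§5 in [We12]). The lift $L$ restricts to the bijection $\mathcal{F}(\mathbb{M}(U)) \to \mathcal{L}^U$
and to the bijection $\mathcal{F}_\perp(\mathbb{M}(U)) \to \mathcal{L}^{U,\perp}$. These are isomorphisms of complete
lattices with inverse $\pi_U$. For $u \in U$ we have $\pi_U[F_\perp(\mathcal{S}, u)] = F_\perp(\mathbb{M}(U), u)$ and
$L[F_\perp(\mathbb{M}(U), u)] = F_\perp(\mathcal{S}, u)$.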

The results give rise to useful lattice isomorphisms, if we use appropriately de- 
fined lattices of projections. 

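Definition 2.19. The projection lattice resp. exposed projection lattice of $U$ is

$$ \mathcal{P}^U = \mathcal{P}^U_{\mathcal{A}} := \mathbb{F}^{-1}(\mathcal{L}^U) \quad \text{resp.} \quad \mathcal{P}^{U,\perp} = \mathcal{P}^{U,\perp}_{\mathcal{A}} := \mathbb{F}^{-1}(\mathcal{L}^{U,\perp}) . \tag{31} $$

Corollary 2.15 and Lemma 2.18 imply two lattice isomorphisms defined for suitable
projections $p$ by $p \mapsto \pi_U(\mathbb{F}(p))$:

$$ \mathcal{P}^U \to \mathcal{F}(\mathbb{M}(U)) \quad \text{resp.} \quad \mathcal{P}^{U,\perp} \to \mathcal{F}_\perp(\mathbb{M}(U)) \tag{32} $$

between $\mathcal{P}^U$ and the face lattice of the mean value set resp. between $\mathcal{P}^{U,\perp}$ and the
exposed face lattice. Lemma 2.18 characterizes the lifted exposed face lattice by

$$ \mathcal{L}^{U,\perp} = \{ F_\perp(\mathcal{S}, u) \mid u \in U \} \cup \{\emptyset\} . $$

The algebraic description in Proposition 2.14 of faces $F_\perp(\mathcal{S}, u)$ of the state space $\mathcal{S}$
translates therefore to the exposed faces of the mean value set $\mathbb{M}(U)$:

Corollary 2.20. The exposed projection lattice is $\mathcal{P}^{U,\perp} = \{ p^+(u) \mid u \in U \} \cup \{0\}$.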

In order to understand the non-exposed faces of $\mathbb{M}(U)$ algebraically, we have to
look at sequences of faces.

Definition 2.21. 1. Let $C$ be a convex subset of the finite-dimensional Euclidean
vector space $(\mathbb{E}, \langle\cdot,\cdot\rangle)$. We call a finite sequence $F_0, \ldots, F_m \subset C$ an access
sequence (of faces) for $C$ if $F_0 = C$ and if $F_{i+1}$ is a properly included exposed face
of $F_i$ for $i = 0, \ldots, m-1$,

$$ F_0 \supsetneq F_1 \supsetneq \cdots \supsetneq F_m . \tag{33} $$

2. For $p \in \mathcal{P}$ and $a \in \mathcal{A}_{\mathrm{sa}}$ the orthogonal projection $\mathcal{A}_{\mathrm{sa}} \to (p\mathcal{A}p)_{\mathrm{sa}}$ is

$$ c^p(a) := \pi_{(p\mathcal{A}p)_{\mathrm{sa}}}(a) = pap . \tag{34} $$

3. We call a finite sequence $p_0, \ldots, p_m \subset \mathcal{P}^U$ an access sequence (of projections) for
$U$ if $p_0 = \mathbb{1}$ and if $p_{i+1}$ belongs to the exposed projection lattice $\mathcal{P}^{c^{p_i}(U),\perp}_{p_i \mathcal{A} p_i}$ for
$i = 0, \ldots, m-1$ and such that ($p_i \succ p_{i+1} :\iff p_i \succeq p_{i+1}$ and $p_i \ne p_{i+1}$)

$$ p_0 \succ p_1 \succ \cdots \succ p_m . $$

Figure 3: A poonem constructed by repeated inclusion of exposed faces.

Grünbaum [Gr03] defines a poonem as an element of an access sequence of faces.
An example is depicted in Figure 3. In finite dimensions the notion of poonem is
equivalent to the notion of face, see e.g. §1.2.1 in [We12].

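Theorem 2.22 (§3.2 in [We11]). The lattice isomorphism $\mathcal{P}^U \to \mathcal{F}(\mathbb{M}(U))$ in (32)
extends to a bijection from the set of access sequences of projections for $U$ to the set
of access sequences of faces for $\mathbb{M}(U)$ by assigning

$$ (p_0, \ldots, p_m) \mapsto \big( \pi_U(\mathbb{F}(p_0)), \ldots, \pi_U(\mathbb{F}(p_m)) \big) . $$

For convenience we cite further results from §3.2 in [We11].

Lemma 2.23. If $p \in \mathcal{P}$ is a projection, then $\pi_U : c^p(U) \to \pi_U((p\mathcal{A}p)_{\mathrm{sa}})$ is a real linear
isomorphism and the following diagrams commute.

[Three commutative diagrams: the maps $\pi_U : (p\mathcal{A}p)_{\mathrm{sa}} \to \pi_U((p\mathcal{A}p)_{\mathrm{sa}})$,
$\pi_U : \mathbb{F}(p) \to \pi_U(\mathbb{F}(p))$ and $\pi_U : \mathrm{ri}(\mathbb{F}(p)) \to \mathrm{ri}(\pi_U(\mathbb{F}(p)))$, drawn over
$c^p(U)$, $\mathbb{M}_{p\mathcal{A}p}(c^p(U))$ and $\mathrm{ri}(\mathbb{M}_{p\mathcal{A}p}(c^p(U)))$, respectively.]

Corollary 2.24. A projection $p \in \mathcal{P}$ belongs to the projection lattice $\mathcal{P}^U$ if and only
if $p$ belongs to an access sequence of projections for $U$.

Corollary 2.25. For each two projections $p, q \in \mathcal{P}^U$ such that $p \preceq q$ there exists an
access sequence for $U$ including $p$ and $q$.

Remark 2.26. Corollary 2.24 implies a computation method for $\mathcal{P}^U$: One has to
compute the maximal projection (see Definition 2.3.2) of all elements of $U$, then the
maximal projections of elements of $c^p(U)$ for each previously calculated projection
$p$ and so on (see Remark 3.10 and §3.3 in [We11]).

Lemma 2.27. If $\rho \in \mathcal{S}$, then $\rho \in \mathrm{ri}(\mathbb{F}(p)) + U^\perp$ holds for a unique projection $p \in \mathcal{P}^U$.
We have $p = \bigwedge \{ q \in \mathcal{P}^U \mid s(\rho) \preceq q \}$.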
3 Information topologies on the state space

We study the I-/rI-topology on the state space of a C*-subalgebra of Mat(n, C). 
Our analysis in §3.2 is based on the idea of divergence function and L*-convergence 
that we recall and customize in §3.1. Unless otherwise specified we will use the norm 
topology. 

3.1 The topology of a divergence function 

We generalize the idea of metric space to the notion of divergence function on a set. A 
topology is associated in the abstract setting of an axiomatic notion of convergence 
of countable sequences. Conditions on the divergence function, available for the 
relative entropy on Mat(n,C), imply quite strong results. Let X be any set. 

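Definition 3.1 (L*-convergence^). A relation $\mathcal{C} \subset X^{\mathbb{N}} \times X$ between sequences and
members of $X$ is a convergence on $X$. If $((x_n)_{n\in\mathbb{N}}, x) \in \mathcal{C}$ then we write $x_n \xrightarrow{\mathcal{C}} x$ and
we say $(x_n)_{n\in\mathbb{N}}$ $\mathcal{C}$-converges to $x$ and $x$ is the $\mathcal{C}$-limit of $(x_n)_{n\in\mathbb{N}}$. The convergence
$\mathcal{C}$ is a sequential convergence on $X$, if

a) $x_n = x$ for all $n$ implies $x_n \xrightarrow{\mathcal{C}} x$,

b) $x_n \xrightarrow{\mathcal{C}} x$ and $(y_n)_{n\in\mathbb{N}}$ is a subsequence of $(x_n)_{n\in\mathbb{N}}$ then $y_n \xrightarrow{\mathcal{C}} x$.

If $\mathcal{C}$ is a sequential convergence on $X$, then $(X, \mathcal{C})$ is a sequential space. A sequential
convergence $\mathcal{C}$ on $X$ is an L-convergence if

c) $x_n \xrightarrow{\mathcal{C}} x$ and $x_n \xrightarrow{\mathcal{C}} y$ implies $x = y$.

The L-convergence $\mathcal{C}$ on $X$ is an L*-convergence and $(X, \mathcal{C})$ is an L*-space if

d) $x_n \not\xrightarrow{\mathcal{C}} x$ (i.e. it is false that $x_n \xrightarrow{\mathcal{C}} x$) implies the existence of a subsequence
$(y_n)_{n\in\mathbb{N}}$ of $(x_n)_{n\in\mathbb{N}}$, such that for any subsequence $(z_n)_{n\in\mathbb{N}}$ of $(y_n)_{n\in\mathbb{N}}$ we have
$z_n \not\xrightarrow{\mathcal{C}} x$.

We consider the family $\mathcal{T}(\mathcal{C})$ of subsets $U \subset X$ such that $x \in U$ and $x_n \xrightarrow{\mathcal{C}} x$ imply
$x_n \in U$ for large $n$.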

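Remark 3.2 (The topology of a convergence). It is well-known [Du64] that $\mathcal{T}(\mathcal{C})$
is a topology on $X$ if $\mathcal{C}$ is a convergence on $X$. Moreover, if $Y \subset X$ is $\mathcal{T}(\mathcal{C})$ closed
then $(y_n)_{n\in\mathbb{N}} \subset Y$ and $y_n \xrightarrow{\mathcal{C}} y$ imply $y \in Y$. Important for our purpose is: If b)
above holds, then the converse is also true, $Y \subset X$ is $\mathcal{T}(\mathcal{C})$ closed if and only if
$(y_n)_{n\in\mathbb{N}} \subset Y$ and $y_n \xrightarrow{\mathcal{C}} y$ imply $y \in Y$.

The important example of a metric space will be generalized in Definition 3.14.

Example 3.3 (Metric spaces). Let $(X, d)$ be a metric space for $d : X \times X \to \mathbb{R}$.
Then $x_n \xrightarrow{\mathcal{C}_d} x :\iff \lim_{i\to\infty} d(x, x_i) = 0$ defines an L*-convergence $\mathcal{C}_d$ on $X$ and
the disks $B(x, \epsilon) := \{ y \in X \mid d(x,y) < \epsilon \}$ for $\epsilon > 0$ define a base for $\mathcal{T}(\mathcal{C}_d)$ at
$x \in X$. The topology $\mathcal{T}(\mathcal{C}_d)$ is known as the metric topology.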

^The following definition of an L*-space is used in [En89, Du64]. An L*-space in the sense of 
[Be63] uses only the axioms a), b) and d). 
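We can also go the opposite way from a topology to a convergence.

Definition 3.4 (The convergence of a topology). If $(X, \mathcal{T})$ is a topological space
then the convergence $\mathcal{C}(\mathcal{T})$ is defined for sequences $(x_i)_{i\in\mathbb{N}} \subset X$ and $x \in X$ by

$$ (x_i)_{i\in\mathbb{N}} \xrightarrow{\mathcal{C}(\mathcal{T})} x \;:\iff\; \text{if } x \in U \in \mathcal{T} \text{ then } x_i \in U \text{ for large } i . $$

For any topological space $(X, \mathcal{T})$ the reader should check that $\mathcal{T} \subset \mathcal{T}(\mathcal{C}(\mathcal{T}))$
holds. Similarly, if $\mathcal{C}$ is a convergence on $X$, then $\mathcal{C} \subset \mathcal{C}(\mathcal{T}(\mathcal{C}))$ holds. An equality
condition was proved by Kisyński, see e.g. Theorem 2.1 in [Du64]:

Theorem 3.5. If $(X, \mathcal{C})$ is an L*-space, then $\mathcal{C}(\mathcal{T}(\mathcal{C})) = \mathcal{C}$.

Continuity can be expressed in terms of convergence or of topology.

Definition 3.6. Let $f : X \to X'$ be a function between arbitrary sets. If $(X, \mathcal{C})$,
$(X', \mathcal{C}')$ are sequential spaces, then $f$ is continuous for $\mathcal{C}$ and $\mathcal{C}'$ at $x \in X$ if
$f(x_n) \xrightarrow{\mathcal{C}'} f(x)$ whenever $x_n \xrightarrow{\mathcal{C}} x$. The function $f$ is continuous for $\mathcal{C}$ and $\mathcal{C}'$ if $f$
is continuous for $\mathcal{C}$ and $\mathcal{C}'$ at every $x \in X$. If $\mathcal{T}$ resp. $\mathcal{T}'$ is a topology on $X$ resp.
$X'$, then $f$ is continuous for $\mathcal{T}$ and $\mathcal{T}'$ if $f^{-1}(U')$ is $\mathcal{T}$ open for every $\mathcal{T}'$ open set
$U' \subset X'$.

The following theorem is an excerpt of Theorem 2.2 in [Du64].

Theorem 3.7. Let $(X, \mathcal{C})$ and $(X', \mathcal{C}')$ be sequential spaces and $f : X \to X'$.

1. If $f$ is continuous for $\mathcal{C}$ and $\mathcal{C}'$, then $f$ is continuous for $\mathcal{T}(\mathcal{C})$ and $\mathcal{T}(\mathcal{C}')$.

2. If $(X', \mathcal{C}')$ is an L*-space then $f$ is continuous for $\mathcal{C}$ and $\mathcal{C}'$ if and only if $f$ is
continuous for $\mathcal{T}(\mathcal{C})$ and $\mathcal{T}(\mathcal{C}')$.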

We need to study subspaces in some detail. 

Definition 3.8. Let $B \subset X$. If $\mathcal{T}$ is a topology on $X$, then the subspace topology

$$ \mathcal{T}|_B := \{ B \cap U \mid U \in \mathcal{T} \} $$

is defined. If $\mathcal{C}$ is a convergence on $X$, we have the subspace convergence

$$ \mathcal{C}|_B := \mathcal{C} \cap (B^{\mathbb{N}} \times B) . $$

Remark 3.9 (Subspaces). If $\mathcal{C}$ is a convergence on $X$ and $B \subset X$, then the inclusion
$\mathcal{T}(\mathcal{C})|_B \subset \mathcal{T}(\mathcal{C}|_B)$ holds (this is easy to prove). If $(X, d)$ is a metric space with
convergence defined in Example 3.3, then $\mathcal{T}(\mathcal{C}_d)|_B = \mathcal{T}(\mathcal{C}_d|_B)$ holds for arbitrary
subsets $B \subset X$. This follows from the fact that the open disks $B(x, \epsilon)$ for $\epsilon > 0$ are
a base at $x \in X$. We will generalize this idea in Lemma 3.16.2.

We consider closures in a sequential space. 






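Definition 3.10. Let $(X, \mathcal{C})$ be a sequential space. The sequential closure of $Y \subset X$
is

$$ \mathrm{cl}_{\mathcal{C}}(Y) := \{ x \in X \mid (x_n)_{n\in\mathbb{N}} \xrightarrow{\mathcal{C}} x \text{ for a sequence } (x_n)_{n\in\mathbb{N}} \subset Y \} . \tag{35} $$

The following property is suggested in [Du64]:

e) $x_n \xrightarrow{\mathcal{C}} x$ and $(x^{(m)}_n)_{n\in\mathbb{N}} \xrightarrow{\mathcal{C}} x_m$ for all $m \in \mathbb{N}$ implies that there exists a
function $n : \mathbb{N} \to \mathbb{N}$, such that $(x^{(m)}_{n(m)})_{m\in\mathbb{N}} \xrightarrow{\mathcal{C}} x$.

A weaker property is proposed in Problem 1.7.18 in [En89]:

e') if $x_n \xrightarrow{\mathcal{C}} x$ and for $n \in \mathbb{N}$ we have $(x^{(n)}_m)_{m\in\mathbb{N}} \xrightarrow{\mathcal{C}} x_n$, then there exist sequences
of positive integers $n_1, n_2, \ldots$ and $m_1, m_2, \ldots$, such that $(x^{(n_k)}_{m_k})_{k\in\mathbb{N}} \xrightarrow{\mathcal{C}} x$.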

The next examples show that sequential closures in L*-spaces of probability 
measures need not be topological closures. 

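Example 3.11. The I-/rI-convergence of probability measures in (2) is an L*-convergence,
see [Ha07, Du98]. Harremoës gives the example of a triangle $D$ in the
probability simplex (7) of $\mathbb{N}$, where $\mathrm{cl}_{\mathcal{C}}(D) \subsetneq \mathrm{cl}_{\mathcal{C}}(\mathrm{cl}_{\mathcal{C}}(D))$ holds for the I-convergence
$\mathcal{C}$. Csiszár and Matúš [CM04] discuss an exponential family $\mathcal{E}$ of Borel probability
measures where $\mathrm{cl}_{\mathcal{C}}(\mathcal{E}) \subsetneq \mathrm{cl}_{\mathcal{C}}(\mathrm{cl}_{\mathcal{C}}(\mathcal{E}))$ holds for the rI-convergence $\mathcal{C}$.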

Topological spaces for which sequential closure is the same as ordinary closure
are known as Fréchet spaces [En89]. Sequential spaces with this property are
characterized by e'):

Remark 3.12. If $(X, \mathcal{C})$ is a sequential space satisfying e) in Definition 3.10 and
$Y \subset X$ then the sequential closure $\mathrm{cl}_{\mathcal{C}}(Y)$ is the $\mathcal{T}(\mathcal{C})$ closure of $Y$.

In detail, a subset $Y \subset X$ is $\mathcal{T}(\mathcal{C})$ closed if and only if $\mathrm{cl}_{\mathcal{C}}(Y) = Y$ by Remark 3.2.
Hence $\mathrm{cl}_{\mathcal{C}}(Y)$ is the $\mathcal{T}(\mathcal{C})$ closure of $Y$ if and only if $\mathrm{cl}_{\mathcal{C}}(\mathrm{cl}_{\mathcal{C}}(Y)) = \mathrm{cl}_{\mathcal{C}}(Y)$. The
latter equation being true for all $Y \subset X$ is easily shown to be equivalent^ to e').
The argument is complete since e') follows from e).

We generalize metric spaces, one aspect is to allow infinite "distances". 

Example 3.13. The positive augmented half-line $[0, \infty] = [0, \infty) \cup \{\infty\}$ is
considered a topological space with the Alexandroff compactification of the positive
half-line $[0, \infty)$. Open sets are of the form $[0, \infty] \setminus F$, where $F$ is a norm compact
subset of $[0, \infty)$, together with all norm open subsets of $[0, \infty)$. Then $([0, \infty], \mathcal{T}^c)$
is a compact Hausdorff space, see e.g. Theorem 3.5.11 in [En89]. The convergence
$\mathcal{C}^c := \mathcal{C}(\mathcal{T}^c)$ clearly equals

$$ \{ ((x_i), x) \in [0,\infty]^{\mathbb{N}} \times [0,\infty) \mid x_i < \infty \text{ for large } i \text{ and } \lim_{i\to\infty} x_i = x \} $$
$$ \cup\; \{ ((x_i), \infty) \mid (x_i) \subset [0,\infty] \text{ such that } \forall R \in [0,\infty) \text{ we have } x_i > R \text{ for large } i \} , $$

where $\lim_{i\to\infty} x_i = x$ in the first term means that the finite valued sequence members
of $(x_i)_{i\in\mathbb{N}}$ converge to $x$ in norm.

It is easy to show $\mathcal{T}(\mathcal{C}(\mathcal{T}^c)) \subset \mathcal{T}^c$ (every $U \in \mathcal{T}(\mathcal{C}(\mathcal{T}^c))$ including $\infty$ has a
bounded complement and with each real $x \in U$ there is a disk $B(x, \epsilon)$ in $U$). The
converse inclusion holds for arbitrary topologies so we have $\mathcal{T}(\mathcal{C}^c) = \mathcal{T}^c$.

It is easy to show that $\mathcal{C}^c$ is an L*-convergence, hence we obtain from Theorem 3.7.2
for any sequential space $(X, \mathcal{C})$ and any function $f : X \to [0, \infty]$ that $f$ is continuous
for $\mathcal{C}$ and $\mathcal{C}^c$ if and only if $f$ is continuous for $\mathcal{T}(\mathcal{C})$ and $\mathcal{T}^c$.

^This equivalence holds also for convergences that are not sequential.

We shall frequently write $\lim_{i\to\infty} x_i = x$ in place of $x_i \xrightarrow{\mathcal{C}^c} x$ for $(x_i)_{i\in\mathbb{N}} \subset [0, \infty]$
and $x \in [0, \infty]$. Finally we study the example that will clear some questions about
the I-/rI-topology.

Definition 3.14 (Divergence functions). 1. A divergence function on a set $X$ is a
function $f : X \times X \to [0, \infty]$, such that for all $x \in X$ we have $f(x,x) = 0$. Let
$\mathcal{C}_f$ be the convergence on $X$ defined by

$$ x_n \xrightarrow{\mathcal{C}_f} x \;:\iff\; \lim_{n\to\infty} f(x, x_n) = 0 . $$

2. Two assumptions will suffice for our purposes to analyze the I-/rI-convergence:

A) $(X, d)$ is a metric space and there is a function $g : [0, \infty] \to [0, \infty]$ such that
for all $x, y \in X$ we have $d(x,y) \le g(f(x,y))$. The function $g$ is continuous for
$\mathcal{C}^c$ and $\mathcal{C}^c$ at $0$ and $g(0) = 0$.

B) For all $x \in X$ the function $X \to [0, \infty]$, $y \mapsto f(x,y)$ is continuous for $\mathcal{C}_f$ and
$\mathcal{C}^c$.

3. For $x \in X$ and $\epsilon \in (0, \infty]$ we define the open f-disk

$$ V_f(x, \epsilon) := \{ y \in X \mid f(x,y) < \epsilon \} $$

and the closed f-disk

$$ W_f(x, \epsilon) := \{ y \in X \mid f(x,y) \le \epsilon \} . $$

Remark 3.15. Property B) fails if $f$ is the relative entropy discussed in the
introduction (3) on the probability space $(\mathbb{N}, 2^{\mathbb{N}})$. Property A) holds for that case due to
the Pinsker-Csiszár inequality.
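
Property A) can be illustrated numerically for finitely supported probability vectors. The sketch below assumes that the Pinsker-Csiszár inequality has the standard form $\|p - q\|_1^2 \le 2\, S(p, q)$ with natural logarithms, so that $g(t) = \sqrt{2t}$ is an admissible choice in A); this form, the $\ell^1$ metric and all function names are our assumptions and are not taken from the text.

```python
import math
import random

def l1_distance(p, q):
    """The metric d in property A): the l1 distance of two probability vectors."""
    return sum(abs(x - y) for x, y in zip(p, q))

def relative_entropy(p, q):
    """The divergence f: relative entropy, infinite unless supp(p) is inside supp(q)."""
    if any(x > 0 and y == 0 for x, y in zip(p, q)):
        return math.inf
    return sum(x * (math.log(x) - math.log(y)) for x, y in zip(p, q) if x > 0)

def g(t):
    """Assumed comparison function in A): g(t) = sqrt(2 t), with g(0) = 0."""
    return math.sqrt(2.0 * max(t, 0.0))

def random_distribution(n, rng):
    """A random probability vector with strictly positive entries."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]
```

With these choices `l1_distance(p, q) <= g(relative_entropy(p, q))` holds for every pair of probability vectors, which is why convergence in relative entropy forces norm convergence.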

We study divergence functions satisfying A) or B). 

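Lemma 3.16 (Divergence functions). Let $f$ be a divergence function on a set $X$.
The convergence $\mathcal{C}_f$ is a sequential convergence with property d) in Definition 3.1.
The sequential closure of $Y \subset X$ is

$$ \mathrm{cl}_{\mathcal{C}_f}(Y) = \{ x \in X \mid \lim_{n\to\infty} f(x, y_n) = 0 \text{ for a sequence } (y_n)_{n\in\mathbb{N}} \subset Y \} $$
$$ \qquad = \{ x \in X \mid \inf_{n\in\mathbb{N}} f(x, y_n) = 0 \text{ for a sequence } (y_n)_{n\in\mathbb{N}} \subset Y \} $$
$$ \qquad = \{ x \in X \mid \inf_{y\in Y} f(x, y) = 0 \} . $$

1. Let $f$ satisfy property A) in Definition 3.14.2 for the metric $d : X \times X \to \mathbb{R}$.
Then $\mathcal{C}_f$ is an L*-convergence, in particular $\mathcal{C}(\mathcal{T}(\mathcal{C}_f)) = \mathcal{C}_f$. We have $\mathcal{C}_f \subset \mathcal{C}_d$
and $\mathcal{T}(\mathcal{C}_f) \supset \mathcal{T}(\mathcal{C}_d)$, in particular $\mathcal{T}(\mathcal{C}_f)$ is a Hausdorff topology.

2. Let $f$ satisfy property B) in Definition 3.14.2. Then for all $x \in X$ the function
$X \to [0, \infty]$, $y \mapsto f(x,y)$ is continuous for $\mathcal{T}(\mathcal{C}_f)$ and $\mathcal{T}^c$. For each $x \in X$
and $\epsilon \in (0, \infty]$ the open f-disk $V_f(x, \epsilon)$ is $\mathcal{T}(\mathcal{C}_f)$ open and the closed f-disk
$W_f(x, \epsilon)$ is $\mathcal{T}(\mathcal{C}_f)$ closed. The open f-disks $\{ V_f(x, \epsilon) \mid \epsilon > 0 \}$ are a base for
$(X, \mathcal{T}(\mathcal{C}_f))$ at $x$. In particular $\mathcal{T}(\mathcal{C}_f)$ is first countable and for any subset $Y \subset X$
we have $\mathcal{T}(\mathcal{C}_f)|_Y = \mathcal{T}(\mathcal{C}_f|_Y)$. The sequential convergence $\mathcal{C}_f$ has property e) in
Definition 3.10, in particular for any subset $Y \subset X$ the sequential closure $\mathrm{cl}_{\mathcal{C}_f}(Y)$
is the $\mathcal{T}(\mathcal{C}_f)$ closure of $Y$.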

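Proof: The statements in the preamble are clear, we show part 1. To prove
that $\mathcal{C}_f$ is an L*-convergence, it suffices to show condition c) in Definition 3.1. Let
$x \in X$ and $(x_i)_{i\in\mathbb{N}} \subset X$. Assuming $x_n \xrightarrow{\mathcal{C}_f} x$, i.e. $\lim_{i\to\infty} f(x, x_i) = 0$, the continuity
of $g$ at zero (for $\mathcal{C}^c$) gives

$$ \lim_{i\to\infty} g \circ f(x, x_i) = 0 . $$

For all $i \in \mathbb{N}$ we have $d(x, x_i) \le g \circ f(x, x_i)$. Hence $\lim_{i\to\infty} d(x, x_i) = 0$ or $x_n \xrightarrow{\mathcal{C}_d} x$
likewise. We have proved $\mathcal{C}_f \subset \mathcal{C}_d$. It follows immediately that $\mathcal{T}(\mathcal{C}_f) \supset \mathcal{T}(\mathcal{C}_d)$ and
since $\mathcal{T}(\mathcal{C}_d)$ is Hausdorff, so is $\mathcal{T}(\mathcal{C}_f)$.

If for a second point $y \in X$ we have $x_n \xrightarrow{\mathcal{C}_f} y$, then we get

$$ d(x, y) \le \lim_{i\to\infty} d(x, x_i) + \lim_{i\to\infty} d(x_i, y) = 0 . $$

This shows $x = y$ and proves that $\mathcal{C}_f$ is an L*-convergence. Now Theorem 3.5 shows
$\mathcal{C}(\mathcal{T}(\mathcal{C}_f)) = \mathcal{C}_f$.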

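We prove part two. For all $x \in X$ the function $X \to [0, \infty]$, $y \mapsto f(x,y)$ is
continuous for $\mathcal{C}_f$ and $\mathcal{C}^c$ by assumption B). The discussion in the last paragraph of
Example 3.13 shows that this function is continuous for $\mathcal{T}(\mathcal{C}_f)$ and $\mathcal{T}^c$. Hence the
preimage of every $\mathcal{T}^c$ open resp. closed subset of $[0, \infty]$ is $\mathcal{T}(\mathcal{C}_f)$ open resp. closed.
In particular, every open resp. closed f-disk is $\mathcal{T}(\mathcal{C}_f)$ open resp. closed. The open
f-disks $\{ V_f(x, \epsilon) \mid \epsilon > 0 \}$ define a base for $(X, \mathcal{T}(\mathcal{C}_f))$ at $x \in X$: By contradiction,
if $U$ is a $\mathcal{T}(\mathcal{C}_f)$ neighborhood of $x$ and $U$ contains no open f-disk about $x$ then
there exists a sequence $(x_i)_{i\in\mathbb{N}} \subset X \setminus U$ with $(x_i)_{i\in\mathbb{N}} \xrightarrow{\mathcal{C}_f} x$. But $X \setminus U$ is $\mathcal{T}(\mathcal{C}_f)$
closed and hence contains all $\mathcal{C}_f$-limits of sequences in $X \setminus U$. Thus $x \in X \setminus U$ and $U$
is not a $\mathcal{T}(\mathcal{C}_f)$ neighborhood of $x$.

The space $(X, \mathcal{T}(\mathcal{C}_f))$ is first countable, e.g. $\{ V_f(x, 1/n) \mid n \in \mathbb{N} \}$ is a base at
$x \in X$. If $Y \subset X$, then $\mathcal{T}(\mathcal{C}_f)|_Y \subset \mathcal{T}(\mathcal{C}_f|_Y)$ holds, see Remark 3.9. Conversely, for
all $y \in Y$ and $\epsilon > 0$ we have

$$ V_{f|_{Y\times Y}}(y, \epsilon) = V_f(y, \epsilon) \cap Y . $$

The divergence function $f|_{Y\times Y}$ on $Y$ satisfies B), hence a set $U \in \mathcal{T}(\mathcal{C}_f|_Y)$ equals

$$ U = \bigcup_{\alpha\in I} V_{f|_{Y\times Y}}(y_\alpha, \epsilon_\alpha) = Y \cap \bigcup_{\alpha\in I} V_f(y_\alpha, \epsilon_\alpha) $$

for some $y_\alpha \in Y$ and $\epsilon_\alpha > 0$, $\alpha \in I$. We have proved $U \in \mathcal{T}(\mathcal{C}_f)|_Y$.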

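We prove property e). If $(x_i)_{i\in\mathbb{N}} \xrightarrow{\mathcal{C}_f} x$ then there exists a sequence of positive
numbers $(\epsilon_i)_{i\in\mathbb{N}} \to 0$, such that $f(x, x_i) < \epsilon_i$ for large $i$. For every $i \in \mathbb{N}$ let
us choose an arbitrary sequence $(x^i_j)_{j\in\mathbb{N}} \subset X$ such that $(x^i_j)_{j\in\mathbb{N}} \xrightarrow{\mathcal{C}_f} x_i$. By B)
$f(x, \cdot)$ is continuous for $\mathcal{C}_f$ and $\mathcal{C}^c$, hence for large $i$ there exists $m_i \in \mathbb{N}$ such that
$f(x, x^i_j) < \epsilon_i$ for all $j \ge m_i$. Then $f(x, x^i_{m_i}) < \epsilon_i$ for large $i$ implies

$$ \lim_{i\to\infty} f(x, x^i_{m_i}) \le \lim_{i\to\infty} \epsilon_i = 0 . $$

This proves property e) of $\mathcal{C}_f$. A consequence for any $Y \subset X$ is that $\mathrm{cl}_{\mathcal{C}_f}(Y)$ is the
$\mathcal{T}(\mathcal{C}_f)$ closure of $Y$ (see Remark 3.12). □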



3.2 The I-topology and the rl-topology 

The relative entropy $S : \mathcal{S} \times \mathcal{S} \to [0, \infty]$ defines two divergence functions. The
associated topologies recognize the convex geometry of the state space $\mathcal{S}$: The relative
interiors of faces of $\mathcal{S}$ are connected components of the I-topology and on each
component the I-topology is the norm topology. The rI-topology fits exactly the needs
of the complete projection theorem, Theorem 4.15. Of course $\mathcal{S}$ is homeomorphic to
a unit ball in the norm topology, a result known as Theorem of Sz. Nagy, see e.g.
§VIII.1 in [Be63]. Corollary 3.22 collects conditions for a commutative algebra.

In the sequel let $\omega \in \{\mathrm{I}, \mathrm{rI}\}$. Several important definitions and a summary of
similarities between the $\omega$-topology and a norm topology are given in §1.4.
Definition 1.6 introduces for $\rho, \sigma \in \mathcal{S}_{\mathcal{A}}$ and $X \subset \mathcal{S}_{\mathcal{A}}$ the functions $S^{\mathrm{I}}(\rho, \sigma) = S(\sigma, \rho)$,
$S^{\mathrm{rI}}(\rho, \sigma) = S(\rho, \sigma)$ and $S^{\omega}(\rho, X) = \inf_{\sigma\in X} S^{\omega}(\rho, \sigma)$.

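Definition 3.17. Let $(\rho_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ be a sequence and let
$\rho \in \mathcal{S}$. We define the $\omega$-convergence on $\mathcal{S}$ by

$\rho_i \xrightarrow{\omega} \rho \;:\iff\; \lim_{i\to\infty} S^{\omega}(\rho,\rho_i) = 0$.

We begin with continuity of the relative entropy, using on the positive augmented half-line
$[0,\infty]$ the L*-convergence of the Alexandroff compactification in Example 3.13.

Proposition 3.18. For every state $\rho \in \mathcal{S}$ the mapping
$\mathcal{S} \to [0,\infty]$, $\sigma \mapsto S^{\omega}(\rho,\sigma)$ is continuous for
$\mathcal{C}^{\omega}$ and this L*-convergence on $[0,\infty]$.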

Proof: Concerning the I-convergence, we have to show for $\rho,\sigma \in \mathcal{S}$ and
$(\tau_i)_{i\in\mathbb{N}} \subset \mathcal{S}$ that $\lim_{i\to\infty} S(\tau_i,\sigma) = 0$
implies $\lim_{i\to\infty} S(\tau_i,\rho) = S(\sigma,\rho)$. Let us first assume that
$s(\rho) \succeq s(\sigma)$ holds, i.e. $S(\sigma,\rho) < \infty$. Since
$\lim_{i\to\infty} S(\tau_i,\sigma) = 0$ we have $s(\sigma) \succeq s(\tau_i)$ for large $i$ and
hence $s(\rho) \succeq s(\tau_i)$ holds for large $i$. By the Pinsker-Csiszár inequality (13)
the sequence $(\tau_i)_{i\in\mathbb{N}}$ converges to $\sigma$ in norm. Hence the continuity of
the von Neumann entropy, see e.g. §II.A in [We78], proves

$S(\tau_i,\rho) = -S(\tau_i) - \mathrm{tr}\,\tau_i\log(\rho) \;\longrightarrow\; -S(\sigma) - \mathrm{tr}\,\sigma\log(\rho) = S(\sigma,\rho)$ for $i \to \infty$.

Second, we consider $s(\rho) \not\succeq s(\sigma)$, i.e. $S(\sigma,\rho) = \infty$. By
Remark 2.6.1 the relative entropy is lower semi-continuous. We obtain
$\liminf_{i\to\infty} S(\tau_i,\rho) \geq S(\sigma,\rho) = \infty$ and this implies
$\lim_{i\to\infty} S(\tau_i,\rho) = \infty$.

Concerning the rl-convergence, we have to show that $\lim_{i\to\infty} S(\sigma,\tau_i) = 0$
implies $\lim_{i\to\infty} S(\rho,\tau_i) = S(\rho,\sigma)$. If $s(\rho) \not\preceq s(\sigma)$
then $S(\rho,\sigma) = \infty$ and the lower semi-continuity of the relative entropy proves
$\lim_{i\to\infty} S(\rho,\tau_i) = \infty$ as in the previous paragraph. Finally we consider
$s(\rho) \preceq s(\sigma)$ with $S(\rho,\sigma) < \infty$. Since $S(\sigma,\tau_i) \to 0$ we
have $s(\sigma) \preceq s(\tau_i)$ for large $i$. Perturbation theory used in Lemma 2.9
completes the proof. □

Norm topology is too coarse for a continuity result similar to Proposition 3.18. 

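Example 3.19. If $\mathcal{A} = \mathbb{C}^2$ then
$S\big((1-\tfrac{1}{n},\tfrac{1}{n}),(1,0)\big) = \infty$ for all $n \in \mathbb{N}$ while
$S\big((1,0),(1,0)\big) = 0$. If $\mathcal{A} = \mathrm{Mat}(2,\mathbb{C})$, then for real
$\alpha$ we have

$S\big(\tfrac{1}{2}(\mathbb{1}_2 + \sigma_1),\, \tfrac{1}{2}(\mathbb{1}_2 + \cos(\alpha)\sigma_1 + \sin(\alpha)\sigma_2)\big) = 0$ if $\alpha = 0 \bmod 2\pi$, and $= \infty$ else.

A less trivial example is Example 9 in [KW11]: Any non-negative limit of
$S(\rho,\sigma_\alpha)$ (or divergence) can be achieved for smooth paths $\sigma_\alpha$
converging in norm to an arbitrary point $\rho$ in the boundary of the Bloch ball
$\mathcal{S}_{\mathrm{Mat}(2,\mathbb{C})}$.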
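
The commutative half of Example 3.19 can be checked in a few lines. The following is a minimal sketch, not part of the original argument: it treats states of $\mathbb{C}^2$ as probability vectors, and the function name `relative_entropy` as well as the particular sequence $(1-1/n, 1/n)$ are illustrative assumptions. It shows that a sequence may converge in norm while the relative entropy in one order stays infinite and in the other order tends to zero.

```python
import math

def relative_entropy(p, q):
    """S(p, q) = sum_i p_i (log p_i - log q_i) for probability vectors.

    Conventions: 0 log 0 = 0, and S(p, q) = +inf unless the support of p
    is contained in the support of q.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * (math.log(pi) - math.log(qi))
    return total

corner = (1.0, 0.0)
for n in (10, 100, 1000):
    rho_n = (1.0 - 1.0 / n, 1.0 / n)  # illustrative sequence, converges to `corner` in norm
    # First value is +inf for every n, second value equals -log(1 - 1/n) and tends to 0.
    print(n, relative_entropy(rho_n, corner), relative_entropy(corner, rho_n))
```

In the notation of Definition 1.6 the first printed value is $S^{I}$ and the second is $S^{rl}$ of the pair (corner, sequence member), so this sequence converges in norm and in the rl sense but not in the I sense.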

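Taking the $\omega$-closure of a set does not change the relative entropy from the set.

Theorem 3.20. Let $\rho \in \mathcal{S}$ and $X \subset \mathcal{S}$. Then
$S^{\omega}(\rho,X) = S^{\omega}(\rho,\mathrm{cl}^{\omega}(X))$ holds.

Proof: For every state $\sigma \in \mathrm{cl}^{\omega}(X)$ there exists by Definition 1.7 a
sequence $(\sigma_i)_{i\in\mathbb{N}} \subset X$, such that
$\sigma_i \xrightarrow{\omega} \sigma$. Proposition 3.18 shows that the relative entropies
converge, $\lim_{i\to\infty} S^{\omega}(\rho,\sigma_i) = S^{\omega}(\rho,\sigma)$. Hence

$S^{\omega}(\rho,X) = \inf_{\tau\in X} S^{\omega}(\rho,\tau) \leq \inf_{i\in\mathbb{N}} S^{\omega}(\rho,\sigma_i) \leq \lim_{i\to\infty} S^{\omega}(\rho,\sigma_i) = S^{\omega}(\rho,\sigma)$.

Taking the infimum over all $\sigma \in \mathrm{cl}^{\omega}(X)$, we get
$S^{\omega}(\rho,X) \leq S^{\omega}(\rho,\mathrm{cl}^{\omega}(X))$. The converse inequality is
trivial. □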

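We now investigate the $\omega$-topology of the state space $\mathcal{S}$. The face lattice
$\mathcal{F}$ of the state space $\mathcal{S}$ was introduced in §2.4. Several concepts around
the $\omega$-topology were already introduced in §1.4. For example, the topology
$\mathcal{T}^{\omega} = \mathcal{T}(\mathcal{C}^{\omega})$ is the $\omega$-topology on
$\mathcal{S}$ according to Definition 1.8 and Definition 1.9. We denote the norm convergence on
$\mathcal{S}$ by $\mathcal{C}^{\|\cdot\|}$ and the norm topology on $\mathcal{S}$ by
$\mathcal{T}^{\|\cdot\|} = \mathcal{T}(\mathcal{C}^{\|\cdot\|})$. The preamble of Lemma 3.16
proves for subsets $X \subset \mathcal{S}$ that the sequential closure (35) equals the
$\omega$-closure defined in Definition 1.7 and denoted by $\mathrm{cl}^{\omega}(X)$. We have

$\mathrm{cl}^{\omega}(X) = \{\rho \in \mathcal{S} \mid \lim_{i\to\infty} S^{\omega}(\rho,\rho_i) = 0 \text{ for a sequence } (\rho_i)_{i\in\mathbb{N}} \subset X\}$
$\phantom{\mathrm{cl}^{\omega}(X)} = \{\rho \in \mathcal{S} \mid \inf_{i\in\mathbb{N}} S^{\omega}(\rho,\rho_i) = 0 \text{ for a sequence } (\rho_i)_{i\in\mathbb{N}} \subset X\}$
$\phantom{\mathrm{cl}^{\omega}(X)} = \{\rho \in \mathcal{S} \mid S^{\omega}(\rho,X) = 0\}$.

Theorem 3.21. 1. The convergence $\mathcal{C}^{\omega}$ is an L*-convergence, in particular
$\mathcal{C}(\mathcal{T}^{\omega}) = \mathcal{C}^{\omega}$. We have
$\mathcal{C}^{\omega} \subset \mathcal{C}^{\|\cdot\|}$ and
$\mathcal{T}^{\omega} \supset \mathcal{T}^{\|\cdot\|}$, in particular $\mathcal{T}^{\omega}$ is a
Hausdorff topology.

2. For every $\rho \in \mathcal{S}$ the mapping $\mathcal{S} \to [0,\infty]$,
$\sigma \mapsto S^{\omega}(\rho,\sigma)$ is continuous for $\mathcal{C}^{\omega}$ and the
L*-convergence on $[0,\infty]$ of Example 3.13, and for $\mathcal{T}^{\omega}$ and the
associated topology on $[0,\infty]$. For each $\rho \in \mathcal{S}$ and
$\epsilon \in (0,\infty]$ the open $\omega$-disk $V^{\omega}(\rho,\epsilon)$ is
$\mathcal{T}^{\omega}$ open and the closed $\omega$-disk $W^{\omega}(\rho,\epsilon)$ is
$\mathcal{T}^{\omega}$ closed. The open $\omega$-disks
$\{V^{\omega}(\rho,\epsilon) \mid \epsilon \in (0,\infty]\}$ are a base for
$(\mathcal{S},\mathcal{T}^{\omega})$ at $\rho$. In particular $\mathcal{T}^{\omega}$ is first
countable and for any subset $X \subset \mathcal{S}$ we have
$\mathcal{T}^{\omega}|_X = \mathcal{T}(\mathcal{C}^{\omega}|_X)$. For any subset
$X \subset \mathcal{S}$ the sequential closure $\mathrm{cl}^{\omega}(X)$ is the
$\mathcal{T}^{\omega}$ closure of $X$.

3. Every term in the partition $\mathcal{S} = \bigcup_{F\in\mathcal{F}} \mathrm{ri}\,F$ is a
connected component of $(\mathcal{S},\mathcal{T}^{I})$. For all faces $F \in \mathcal{F}$ we
have $\mathcal{C}^{I}|_{\mathrm{ri}F} = \mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F}$ and
$\mathcal{T}^{I}|_{\mathrm{ri}F} = \mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}F}$.

4. We have $\mathcal{T}^{\|\cdot\|} \subset \mathcal{T}^{rl} \subset \mathcal{T}^{I}$ and
$\mathcal{C}^{\|\cdot\|} \supset \mathcal{C}^{rl} \supset \mathcal{C}^{I}$.

5. We have $\mathrm{cl}^{rl}(\mathrm{ri}\,\mathcal{S}) = \mathcal{S}$, in particular the
topological space $(\mathcal{S},\mathcal{T}^{rl})$ is connected.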

Proof: The divergence functions $S^{I}(\rho,\sigma) := S(\sigma,\rho)$ and
$S^{rl}(\rho,\sigma) := S(\rho,\sigma)$ defined for $\rho,\sigma \in \mathcal{S}$ satisfy the
condition A) in Definition 3.14.2 by the Pinsker-Csiszár inequality (13) and they satisfy the
condition B) by Proposition 3.18. Hence Lemma 3.16 part 1 and 2 prove part 1 and 2.

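We show part 3. According to part 2, for every $\rho \in \mathcal{S}$ the open I-disk of
infinite radius is $\mathcal{T}^{I}$ open and has by Remark 2.16 the form

$V^{I}(\rho,\infty) = \{\sigma \in \mathcal{S} \mid S(\sigma,\rho) < \infty\} = \{\sigma \in \mathcal{S} \mid s(\sigma) \preceq s(\rho)\} = \mathbb{F}(s(\rho))$. (36)

By the lattice isomorphism $\mathbb{F} : \mathcal{P} \to \mathcal{F}$ in Corollary 2.15 we
obtain that every face $F$ of $\mathcal{S}$ is $\mathcal{T}^{I}$ open. Let us show that
$\mathrm{ri}\,F$ is $\mathcal{T}^{I}$ open. The complement $\mathcal{S} \setminus F$ is
$\mathcal{T}^{I}$ closed and the relative boundary $\mathrm{rb}\,F$ of $F$ (in the norm
topology) is norm closed. By part 2 we have $\mathcal{T}^{\|\cdot\|} \subset \mathcal{T}^{I}$,
hence $\mathrm{rb}\,F$ is $\mathcal{T}^{I}$ closed and

$\mathrm{ri}\,F = \mathcal{S} \setminus (\mathrm{rb}\,F \cup (\mathcal{S} \setminus F))$

is $\mathcal{T}^{I}$ open. Now by the stratification (30) the relative interior
$\mathrm{ri}\,F = \mathcal{S} \setminus \bigcup_{G\in\mathcal{F},\,G\neq F} \mathrm{ri}\,G$ is
$\mathcal{T}^{I}$ closed.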

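Let $F \in \mathcal{F}$ be an arbitrary face. Since the relative entropy is norm continuous
on $\mathrm{ri}\,F \times \mathrm{ri}\,F$ we have
$\mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F} \subset \mathcal{C}^{I}|_{\mathrm{ri}F}$ and the
converse inclusion follows from the Pinsker-Csiszár inequality. Hence
$\mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F} = \mathcal{C}^{I}|_{\mathrm{ri}F}$ follows. With part 2
we have

$\mathcal{T}^{I}|_{\mathrm{ri}F} = \mathcal{T}(\mathcal{C}^{I}|_{\mathrm{ri}F}) = \mathcal{T}(\mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F}) = \mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}F}$.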

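To show part 4 we begin with a proof of $\mathcal{T}^{rl} \subset \mathcal{T}^{I}$. We first
notice $\mathcal{C}^{rl}|_{\mathrm{ri}F} = \mathcal{C}^{I}|_{\mathrm{ri}F}$ for every face $F$
of $\mathcal{S}$. This follows from
$\mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F} = \mathcal{C}^{I}|_{\mathrm{ri}F}$ proved in part 3
and from $\mathcal{C}^{\|\cdot\|}|_{\mathrm{ri}F} = \mathcal{C}^{rl}|_{\mathrm{ri}F}$, which
can be proved analogously. By part 2 we have

$\mathcal{T}^{rl}|_{\mathrm{ri}F} = \mathcal{T}(\mathcal{C}^{rl}|_{\mathrm{ri}F}) = \mathcal{T}(\mathcal{C}^{I}|_{\mathrm{ri}F}) = \mathcal{T}^{I}|_{\mathrm{ri}F}$.

Let $U \in \mathcal{T}^{rl}$. Then
$U \cap \mathrm{ri}\,F \in \mathcal{T}^{rl}|_{\mathrm{ri}F} = \mathcal{T}^{I}|_{\mathrm{ri}F}$
and since $\mathrm{ri}\,F$ is $\mathcal{T}^{I}$ open by part 3, this shows
$U \cap \mathrm{ri}\,F \in \mathcal{T}^{I}$. Now
$U = \bigcup_{F\in\mathcal{F}} (U \cap \mathrm{ri}\,F) \in \mathcal{T}^{I}$ and we have proved
$\mathcal{T}^{rl} \subset \mathcal{T}^{I}$.

Part 2 adds the inclusion
$\mathcal{T}^{\|\cdot\|} \subset \mathcal{T}^{rl} \subset \mathcal{T}^{I}$. By part 1, these
topologies arise from L*-convergences, hence we get
$\mathcal{C}^{\|\cdot\|} \supset \mathcal{C}^{rl} \supset \mathcal{C}^{I}$ from Theorem 3.5.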

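We prove part 5 and we first show that any non-empty $\mathcal{T}^{rl}$ open set
$U \subset \mathcal{S}$ must intersect $\mathrm{ri}\,\mathcal{S}$. Let $\rho \in U$, then by
part 2 there is $\epsilon > 0$ such that $U$ contains an open rl-disk
$V^{rl}(\rho,\epsilon) = \{\sigma \in \mathcal{S} \mid S(\rho,\sigma) < \epsilon\}$. We show
that $V^{rl}(\rho,\epsilon)$ intersects $\mathrm{ri}\,\mathcal{S}$. The norm closure
$\mathcal{S} = \overline{\mathrm{ri}\,\mathcal{S}}$ contains $\rho$, hence there is a sequence
$(\rho_i)_{i\in\mathbb{N}} \subset \mathrm{ri}\,\mathcal{S}$ such that
$\rho = \lim_{i\to\infty} \rho_i$. The relative entropy is lower semi-continuous by
Remark 2.6.1, so we have

$0 = S(\rho,\rho) = \liminf_{i\to\infty} S(\rho,\rho_i)$.

Hence $\{\rho_i\}_{i\in\mathbb{N}} \cap V^{rl}(\rho,\epsilon) \neq \emptyset$ proves
$\mathrm{ri}\,\mathcal{S} \cap U \neq \emptyset$.

As shown in the previous paragraph, the relative boundary $\mathrm{rb}\,\mathcal{S}$ does not
contain a non-empty $\mathcal{T}^{rl}$ open set, so the $\mathcal{T}^{rl}$ closure of
$\mathrm{ri}\,\mathcal{S}$ equals $\mathcal{S}$. The claim
$\mathrm{cl}^{rl}(\mathrm{ri}\,\mathcal{S}) = \mathcal{S}$ follows because the
$\mathcal{T}^{rl}$ closure equals the sequential closure
$\mathrm{cl}^{rl}(\mathrm{ri}\,\mathcal{S})$ by part 2.

We show that $\mathcal{S}$ is $\mathcal{T}^{rl}$ connected. By part 3 and 4 we have
$\mathcal{T}^{rl}|_{\mathrm{ri}\mathcal{S}} = \mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}\mathcal{S}}$.
The convex set $\mathrm{ri}\,\mathcal{S}$ is connected in the norm topology, hence in the
$\mathcal{T}^{rl}$ topology. This shows that
$\mathcal{S} = \mathrm{cl}^{rl}(\mathrm{ri}\,\mathcal{S})$ is $\mathcal{T}^{rl}$ connected
because the closure of a connected set is connected, see e.g. §IV.7 in [Be63]. □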

We formulate conditions for a commutative algebra. 

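Corollary 3.22. If $\dim_{\mathbb{C}}(\mathcal{A}) > 1$, then
$\mathcal{C}^{I} \subsetneq \mathcal{C}^{rl}$, $\mathcal{T}^{I} \supsetneq \mathcal{T}^{rl}$
and $\mathcal{S}$ is not $\mathcal{T}^{I}$ compact. The following assertions are equivalent.

1. $\mathcal{A}$ is commutative,
2. $\mathcal{T}^{I}$ is second countable,
3. $\mathcal{T}^{rl}$ is second countable,
4. $\mathcal{C}^{rl} = \mathcal{C}^{\|\cdot\|}$,
5. $\mathcal{T}^{rl} = \mathcal{T}^{\|\cdot\|}$,
6. $\mathcal{S}$ is $\mathcal{T}^{rl}$ compact.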

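Proof: Item 1 implies 4. If $\mathcal{A}$ is commutative, then by (65) it is isomorphic to
$\mathbb{C}^{N}$. We can argue by convergence in components of $\mathbb{C}^{N}$ and find
$\mathcal{C}^{\|\cdot\|} = \mathcal{C}^{rl}$.

We prove the statements in the headline. If $\dim_{\mathbb{C}}(\mathcal{A}) > 1$, then by (65)
$\mathcal{A}$ contains a C*-subalgebra $\mathcal{B} \cong \mathbb{C}^{2}$ and by Example 3.19
we have
$\mathcal{C}^{I}|_{\mathcal{S}_{\mathcal{B}}} \subsetneq \mathcal{C}^{\|\cdot\|}|_{\mathcal{S}_{\mathcal{B}}}$
while
$\mathcal{C}^{rl}|_{\mathcal{S}_{\mathcal{B}}} = \mathcal{C}^{\|\cdot\|}|_{\mathcal{S}_{\mathcal{B}}}$
was shown in the previous paragraph. So $\mathcal{C}^{I} \subsetneq \mathcal{C}^{rl}$ follows
and we have also $\mathcal{T}^{I} \supsetneq \mathcal{T}^{rl}$ because
$\mathcal{C}^{\omega} = \mathcal{C}(\mathcal{T}^{\omega})$ holds for $\omega \in \{I, rl\}$ by
Theorem 3.21.1.

We show that $\mathcal{S}$ is not $\mathcal{T}^{I}$ compact if
$\dim_{\mathbb{C}}(\mathcal{A}) > 1$. Theorem 3.21.3 shows
$(\mathrm{ri}\,\mathcal{S},\mathcal{T}^{I}|_{\mathrm{ri}\mathcal{S}}) = (\mathrm{ri}\,\mathcal{S},\mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}\mathcal{S}})$.
But $(\mathrm{ri}\,\mathcal{S},\mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}\mathcal{S}})$ is not a
compact topological space since $\mathrm{ri}\,\mathcal{S}$ is the relative interior of a convex
set of dimension $> 0$. Then $\mathcal{S}$ is not $\mathcal{T}^{I}$ compact because
$\mathrm{ri}\,\mathcal{S}$ is its connected component.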

Item 5 implies 3 and 6. By Proposition 2.14 the state space $\mathcal{S}$ is a convex body,
hence is a norm compact metric space. On the other hand, a compact metric space
is second countable, see e.g. §V.4-5 in [Be63]. Since
$\mathcal{T}^{rl} = \mathcal{T}^{\|\cdot\|}$ is assumed, the state space is $\mathcal{T}^{rl}$
compact and $\mathcal{T}^{rl}$ second countable.

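Item 1 implies 2. For every face $F \in \mathcal{F}$ we have
$\mathcal{T}^{I}|_{\mathrm{ri}F} = \mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}F}$ by Theorem 3.21.3.
As shown in the previous paragraph, $\mathcal{T}^{\|\cdot\|}|_{F}$ is second countable. Since
$\mathrm{ri}\,F$ is an open subset of $F$, the topology
$\mathcal{T}^{I}|_{\mathrm{ri}F} = \mathcal{T}^{\|\cdot\|}|_{\mathrm{ri}F}$ is second
countable. The simplex $\mathcal{S}$ is partitioned into finitely many relative interiors
$\mathrm{ri}\,F$ of faces $F$ by (30). Since each of these sets is a connected component of
$(\mathcal{S},\mathcal{T}^{I})$, the proof is complete.

We prepare an argument to show that each of items 2, 3 or 6 implies 1. By Remark 2.16 we can
write for any state $\rho \in \mathcal{S}$ the open rl-disk of infinite radius in the form

$V^{rl}(\rho,\infty) = \{\sigma \in \mathcal{S} \mid s(\rho) \preceq s(\sigma)\} = \bigcup_{p\in\mathcal{P},\, p \succeq s(\rho)} \mathrm{ri}\,\mathbb{F}(p)$. (37)

Here $\mathcal{P}$ denotes the projection lattice of $\mathcal{A}$. The open rl-disks are
$\mathcal{T}^{rl}$ open by Theorem 3.21.2. For pure states
$p,q \in \mathcal{P} \cap \mathcal{S}$ we have

$p \notin V^{rl}(q,\infty)$ if $p \neq q$.

If $\mathcal{A}$ is non-commutative then $\mathcal{A}$ contains a C*-subalgebra isomorphic to
$\mathrm{Mat}(2,\mathbb{C})$, see (65), hence $\mathcal{P} \cap \mathcal{S}$ is uncountably
infinite.

Item 6 implies 1. If $\mathcal{A}$ is non-commutative, then the open cover
$\{V^{rl}(p,\infty)\}_{p\in\mathcal{P}\cap\mathcal{S}}$ of $\mathcal{S}$ has no finite
subcover.

Item 3 implies 1. If $\mathcal{B}$ is a base of $\mathcal{T}^{rl}$, then for all
$p \in \mathcal{P} \cap \mathcal{S}$ there is an open set $U_p \in \mathcal{B}$ such that
$p \in U_p \subset V^{rl}(p,\infty)$. The map
$\mathcal{P} \cap \mathcal{S} \to \mathcal{B}$, $p \mapsto U_p$ is injective. This proves that
$\mathcal{B}$ is not countable if $\mathcal{A}$ is non-commutative.

Item 2 implies 1. Theorem 3.21.4 shows $\mathcal{T}^{rl} \subset \mathcal{T}^{I}$ so the
arguments in the previous paragraph apply unmodified.

Item 4 and 5 are equivalent. By Theorem 3.21.1 we have
$\mathcal{C}^{rl} = \mathcal{C}(\mathcal{T}^{rl})$, and
$\mathcal{C}^{\|\cdot\|} = \mathcal{C}(\mathcal{T}^{\|\cdot\|})$ holds because norm
convergence is an L*-convergence. On the other hand
$\mathcal{T}^{rl} = \mathcal{T}(\mathcal{C}^{rl})$ and
$\mathcal{T}^{\|\cdot\|} = \mathcal{T}(\mathcal{C}^{\|\cdot\|})$ hold by definition. □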



4 Exponential families in a matrix algebra 

The complete projection theorem for an exponential family $\mathcal{E}$ in a C*-subalgebra
$\mathcal{A}$ of $\mathrm{Mat}(n,\mathbb{C})$ is proved in §4.2. It is connected to the
rl-topology and will be formulated in terms of the rl-closure
$\mathrm{cl}^{rl}(\mathcal{E})$ defined in §1.4. The resulting rl-projection onto
$\mathrm{cl}^{rl}(\mathcal{E})$ is used in §4.3 to study local maximizers of the entropy
distance $d_{\mathcal{E}}$ defined in (12). In §4.4 we discuss non-commutative phenomena of the
entropy distance and of the mean value parametrization of $\mathrm{cl}^{rl}(\mathcal{E})$. We
also make a continuity conjecture for $d_{\mathcal{E}}$. In §4.5 we prove the complete
Pythagorean theorem and we maximize the von Neumann entropy under linear constraints, providing
previously unknown solutions.

The analysis is based on the mean value parametrization of $\mathcal{E}$ developed in §4.1.
From Definition 1.4 we recall the real analytic function

$R_{\mathcal{A}} : \mathcal{A}_{sa} \to \mathcal{A}_{sa}$, $\quad R(\theta) = R_{\mathcal{A}}(\theta) = \exp_{\mathcal{A}}(\theta)/\mathrm{tr}(\exp_{\mathcal{A}}(\theta))$.

Throughout this section we consider a non-empty affine subspace
$\Theta \subset \mathcal{A}_{sa}$, its translation vector space

$U := \mathrm{lin}(\Theta) = \Theta - \Theta$

and the exponential family $\mathcal{E} := R_{\mathcal{A}}(\Theta)$. Using the orthogonal
projection $\pi_U : \mathcal{A}_{sa} \to U$, the complete projection theorem implies the
bijection
$\pi_U|_{\mathrm{cl}^{rl}(\mathcal{E})} : \mathrm{cl}^{rl}(\mathcal{E}) \to \mathbb{M}(U)$ to
the mean value set $\mathbb{M}(U) = \pi_U(\mathcal{S})$ defined in (4).
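
To make the map $R_{\mathcal{A}}$ concrete, here is a minimal numerical sketch for $\mathcal{A} = \mathrm{Mat}(2,\mathbb{C})$. The helper name `gibbs`, the use of NumPy and the test matrix are assumptions of this illustration and do not come from the paper. The sketch checks that $R(\theta)$ is a state, that adding a multiple of the identity to $\theta$ does not change it, and that it agrees with the standard Bloch-vector formula $\tfrac{1}{2}(\mathbb{1} + \tanh(r)\,\theta/r)$ for traceless $\theta$ with $\theta^2 = r^2 \mathbb{1}$.

```python
import numpy as np

def gibbs(theta):
    """R(theta) = exp(theta) / tr exp(theta) for a Hermitian matrix theta."""
    w, v = np.linalg.eigh(theta)
    w = w - w.max()                     # stabilizing shift, cancels in the ratio
    e = (v * np.exp(w)) @ v.conj().T    # v diag(exp(w)) v^*
    return e / np.trace(e).real

sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
theta = 0.7 * sigma1 - 0.2 * sigma2    # arbitrary traceless test point

rho = gibbs(theta)
rho_shifted = gibbs(theta + 3.0 * np.eye(2))          # same state: R(theta + c 1) = R(theta)
r = np.sqrt(0.7**2 + 0.2**2)
bloch = 0.5 * (np.eye(2) + np.tanh(r) / r * theta)    # closed form for 2x2 matrices
print(np.allclose(rho, rho_shifted), np.allclose(rho, bloch))
```

The invariance under shifts by the identity is the reason why the assumption $\mathbb{1} \notin U$ appears in several statements of this section.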






4.1 The mean value chart 

We generalize the identity chart of the manifold of invertible states
$\mathrm{ri}\,\mathcal{S}$ in a C*-subalgebra $\mathcal{A}$ of $\mathrm{Mat}(n,\mathbb{C})$ to
the mean value chart of an exponential family. Its inverse is the real analytic mean value
parametrization of the exponential family.

We recall from §2.3 in [We11] that the relative interior of the state space consists of all
invertible states,

$\mathrm{ri}\,\mathcal{S} = \{\rho \in \mathcal{S} \mid \rho^{-1} \text{ exists in } \mathcal{A}\}$ (38)

and that $\mathrm{ri}\,\mathcal{S}$ is open in the norm topology of
$\mathcal{A}_1 = \{a \in \mathcal{A}_{sa} \mid \mathrm{tr}(a) = 1\}$.

Restrictions to affine subspaces of $\mathcal{A}_{sa}$ are the rule in subsequent arguments,
hence we accept relatively open convex subsets of $\mathcal{A}_{sa}$ (in place of open subsets
of $\mathbb{R}^{n}$) as domains of differentiable maps and as ranges of diffeomorphisms and
charts.

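Proposition 4.1. Let $\mathbb{1} \notin U$ for the multiplicative identity $\mathbb{1}$ of
$\mathcal{A}$.

1. The projection $\pi_U(\mathcal{E})$ of $\mathcal{E}$ onto $U$ is open relative to $U$ and
$\pi_U \circ R_{\mathcal{A}}|_{\Theta} : \Theta \to \pi_U(\mathcal{E})$ is a real analytic
diffeomorphism.

2. If $\Theta$ has codimension one in $\mathcal{A}_{sa}$, then
$R_{\mathcal{A}}|_{\Theta} : \Theta \to \mathrm{ri}\,\mathcal{S}$ is a real analytic
diffeomorphism.

3. The bijections $(R_{\mathcal{A}}|_{\Theta})^{-1} : \mathcal{E} \to \Theta$ and
$\pi_U|_{\mathcal{E}} : \mathcal{E} \to \pi_U(\mathcal{E})$ are global charts for
$\mathcal{E}$ and $(\pi_U|_{\mathcal{E}})^{-1} : \pi_U(\mathcal{E}) \to \mathcal{E}$ is real
analytic.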

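Proof: In part 1, the derivative of $\mathcal{A}_{sa} \to \mathbb{R}$,
$a \mapsto \mathrm{tr}\exp_{\mathcal{A}}(a)$ can be computed from (22) using cyclic reordering
under the trace. For $a,u \in \mathcal{A}_{sa}$ we have

$\frac{\partial}{\partial t}\big|_{t=0} \mathrm{tr}\exp_{\mathcal{A}}(a + tu) = \langle u, \exp_{\mathcal{A}}(a)\rangle$.

Hence the free energy (Definition 1.12) has for $\theta,u \in \mathcal{A}_{sa}$ the derivative

$\frac{\partial}{\partial t}\big|_{t=0} F_{\mathcal{A}}(\theta + tu) = \langle u, R_{\mathcal{A}}(\theta)\rangle$. (39)

From the product rule and (22) we get

$\frac{\partial}{\partial t}\big|_{t=0} R_{\mathcal{A}}(\theta + tu) = \int_0^1 R_{\mathcal{A}}(\theta)^{1-y}\, u\, R_{\mathcal{A}}(\theta)^{y}\, dy - \langle u, R_{\mathcal{A}}(\theta)\rangle R_{\mathcal{A}}(\theta)$. (40)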
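
The derivative formulas (39) and (40) can be compared with central finite differences. The following sketch is only a numerical sanity check under stated assumptions: $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$, $F(\theta) = \log\mathrm{tr}\exp(\theta)$ for the free energy (consistent with $\log R(\theta) = \theta - F(\theta)\mathbb{1}$ used later in the proof), the Hilbert-Schmidt inner product for $\langle\cdot,\cdot\rangle$, and Gauss-Legendre quadrature for the $y$-integral. All function names and test matrices are hypothetical.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta), evaluated stably from the eigenvalues."""
    w = np.linalg.eigvalsh(theta)
    return w.max() + np.log(np.sum(np.exp(w - w.max())))

def gibbs(theta):
    """R(theta) = exp(theta) / tr exp(theta)."""
    w, v = np.linalg.eigh(theta)
    e = (v * np.exp(w - w.max())) @ v.conj().T
    return e / np.trace(e).real

def power(a, y):
    """a**y for a positive definite Hermitian matrix a."""
    w, v = np.linalg.eigh(a)
    return (v * w**y) @ v.conj().T

def inner(a, b):
    """Hilbert-Schmidt inner product tr(a^* b), real for Hermitian a and b."""
    return np.trace(a.conj().T @ b).real

def d_gibbs(theta, u, nodes=20):
    """Right-hand side of (40); the integral over [0, 1] uses Gauss-Legendre nodes."""
    r = gibbs(theta)
    x, wts = np.polynomial.legendre.leggauss(nodes)   # nodes and weights on [-1, 1]
    integral = sum(0.5 * wt * power(r, 1.0 - 0.5 * (xi + 1.0)) @ u @ power(r, 0.5 * (xi + 1.0))
                   for xi, wt in zip(x, wts))
    return integral - inner(u, r) * r

theta = np.array([[0.2, 0.5, 0.0], [0.5, -0.3, 0.1j], [0.0, -0.1j, 0.4]])
u = np.array([[1.0, 0.0, 0.3], [0.0, -1.0, 0.0], [0.3, 0.0, 0.5]], dtype=complex)
h = 1e-5

lhs39 = (free_energy(theta + h * u) - free_energy(theta - h * u)) / (2 * h)
rhs39 = inner(u, gibbs(theta))
lhs40 = (gibbs(theta + h * u) - gibbs(theta - h * u)) / (2 * h)
rhs40 = d_gibbs(theta, u)
print(abs(lhs39 - rhs39), np.linalg.norm(lhs40 - rhs40))
```

Both printed discrepancies should be at the level of the finite-difference error, which is far below the size of the derivatives themselves.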

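For $\theta,u,v \in \mathcal{A}_{sa}$ we consider the real symmetric bilinear form

$\langle\langle u,v\rangle\rangle_{\theta} := \frac{\partial^2}{\partial s\,\partial t}\big|_{s=t=0} F_{\mathcal{A}}(\theta + su + tv)$. (41)

If restricted to $\theta \in \Theta$ and $u,v \in U$ this bilinear form is called BKM-metric,
see Remark 4.2. We obtain from (39) and (40)

$\langle\langle u,v\rangle\rangle_{\theta} = \langle u, \frac{\partial}{\partial t}\big|_{t=0} R_{\mathcal{A}}(\theta + tv)\rangle = \int_0^1 \langle \zeta(u,y), \zeta(v,y)\rangle\, dy$ (42)

with the not necessarily self-adjoint matrix

$\zeta(u,y) := R_{\mathcal{A}}(\theta)^{\frac{y}{2}}\, [u - \langle u, R_{\mathcal{A}}(\theta)\rangle \mathbb{1}]\, R_{\mathcal{A}}(\theta)^{\frac{1-y}{2}}$.

We have $\langle\langle u,u\rangle\rangle_{\theta} > 0$ unless $u \in \mathcal{A}_{sa}$ is a
(real) scalar multiple of $\mathbb{1}$. Hence $\langle\langle\cdot,\cdot\rangle\rangle_{\theta}$
is a non-degenerate bilinear form on every linear subspace of $\mathcal{A}_{sa}$ not containing
$\mathbb{1}$, in particular on $U$.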
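
The positivity statement can be illustrated by approximating the mixed second derivative in (41) with finite differences. This is a hedged sketch rather than the paper's method: it assumes $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$ and $F(\theta) = \log\mathrm{tr}\exp(\theta)$, and the names `free_energy` and `bkm` as well as the test matrices are invented for the illustration.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta), evaluated stably from the eigenvalues."""
    w = np.linalg.eigvalsh(theta)
    return w.max() + np.log(np.sum(np.exp(w - w.max())))

def bkm(theta, u, v, h=1e-3):
    """Finite-difference estimate of d^2/(ds dt) F(theta + s u + t v) at s = t = 0."""
    f = free_energy
    return (f(theta + h * u + h * v) - f(theta + h * u - h * v)
            - f(theta - h * u + h * v) + f(theta - h * u - h * v)) / (4.0 * h * h)

theta = np.array([[0.2, 0.5, 0.0], [0.5, -0.3, 0.1j], [0.0, -0.1j, 0.4]])
u = np.diag([1.0, -1.0, 0.0])                                        # not a multiple of the identity
v = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.5], [0.0, 0.5, 0.0]])
one = np.eye(3)

print(bkm(theta, u, u))      # strictly positive
print(bkm(theta, one, one))  # zero up to rounding: the identity spans the degenerate direction
print(bkm(theta, u, v) - bkm(theta, v, u))  # symmetry, zero up to rounding
```

The vanishing value for the identity reflects $F(\theta + t\mathbb{1}) = F(\theta) + t$, which is linear in $t$.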

Since $R|_{\Theta}$ is real analytic, the composition $\pi_U \circ R|_{\Theta}$ with the
orthogonal projection to $U$ is also real analytic. If $(u_i)_{i=1}^{k}$ is an orthonormal
basis of $U$ then the directional derivative at $\theta \in \Theta$ along $u \in U$ is by (42)

$\frac{\partial}{\partial t}\big|_{t=0} \pi_U \circ R(\theta + tu) = \pi_U\big(\frac{\partial}{\partial t}\big|_{t=0} R(\theta + tu)\big) = \sum_{i=1}^{k} \langle\langle u,u_i\rangle\rangle_{\theta}\, u_i$. (43)

Since $\langle\langle\cdot,\cdot\rangle\rangle_{\theta}$ is non-degenerate on $U$, the Jacobian
of $\pi_U \circ R|_{\Theta}$ is invertible everywhere. Then the inverse function theorem
implies that $\pi_U \circ R|_{\Theta}$ is locally invertible and its local inverses are real
analytic functions, see e.g. §2.5 in [KP02]. This implies that the image
$\pi_U \circ R(\Theta)$ is an open subset relative to $U$.

On the other hand, the global injectivity of $\pi_U \circ R|_{\Theta}$ follows immediately
from the projection theorem (11): If there are $\theta,\theta' \in \Theta$ such that
$\pi_U \circ R(\theta) = \pi_U \circ R(\theta')$, then $R(\theta) = R(\theta')$. Taking the
logarithm on both sides one has
$\theta - F(\theta)\mathbb{1} = \theta' - F(\theta')\mathbb{1}$, so the difference
$\theta - \theta'$ is proportional to $\mathbb{1}$. Hence $\theta = \theta'$ by the assumption
$\mathbb{1} \notin U$. This completes the proof that
$\pi_U \circ R|_{\Theta} : \Theta \to \pi_U(\mathcal{E})$ is a real analytic diffeomorphism.

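In part 2, if $\Theta = \mathcal{A}_0$ is the space of traceless self-adjoint matrices, then

$\log_0 : \mathrm{ri}\,\mathcal{S} \to \mathcal{A}_0$, $\quad \rho \mapsto \log(\rho) - \frac{\mathrm{tr}(\log\rho)}{\mathrm{tr}(\mathbb{1})}\,\mathbb{1}$ (44)

is inverse to $R|_{\mathcal{A}_0}$ and this shows $R(\Theta) = \mathrm{ri}\,\mathcal{S}$. Since
$R(\theta + \lambda\mathbb{1}) = R(\theta)$ for all $\theta \in \mathcal{A}_{sa}$ and
$\lambda \in \mathbb{R}$, we have $R(\Theta) = \mathrm{ri}\,\mathcal{S}$ for every affine
subspace $\Theta \subset \mathcal{A}_{sa}$ of codimension one and with
$\mathbb{1} \notin \mathrm{lin}(\Theta)$.

In part 3, by virtue of the real analytic diffeomorphism in 1 it is sufficient to prove that
$R_{\mathcal{A}}|_{\Theta} : \Theta \to \mathcal{E}$ is a real analytic bijection. The function
$R_{\mathcal{A}}$ is real analytic by definition and $R_{\mathcal{A}}|_{\Theta}$ is invertible
on $\mathcal{E}$ by 2. □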
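
The inverse pair in part 2 can be tried out numerically. The sketch below assumes $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$, so that $\mathrm{tr}(\mathbb{1}) = 3$, and it reads (44) as "matrix logarithm minus its normalized trace"; that reading, the names `gibbs` and `log0`, and the test matrix are assumptions of this illustration.

```python
import numpy as np

def gibbs(theta):
    """R(theta) = exp(theta) / tr exp(theta)."""
    w, v = np.linalg.eigh(theta)
    e = (v * np.exp(w - w.max())) @ v.conj().T
    return e / np.trace(e).real

def log0(rho):
    """Traceless logarithm of an invertible state: log(rho) - tr(log rho)/tr(1) * 1."""
    w, v = np.linalg.eigh(rho)
    log_rho = (v * np.log(w)) @ v.conj().T
    n = rho.shape[0]
    return log_rho - (np.trace(log_rho).real / n) * np.eye(n)

# A traceless Hermitian test point (0.4 - 0.6 + 0.2 = 0).
theta0 = np.array([[0.4, 0.5, 0.0], [0.5, -0.6, 0.1j], [0.0, -0.1j, 0.2]])
rho = gibbs(theta0)

print(np.allclose(log0(rho), theta0))       # log0 undoes R on traceless matrices
print(np.allclose(gibbs(log0(rho)), rho))   # R undoes log0 on invertible states
```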



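Remark 4.2. If $\Theta$ has codimension one in $\mathcal{A}_{sa}$ and if
$\mathbb{1} \notin U$ then the scalar product (42) defined at $\theta \in \Theta$ for
$u,v \in U$ is the BKM-metric, a Riemannian metric on $\Theta \cong \mathrm{ri}\,\mathcal{S}$
named after Bogoliubov, Kubo and Mori, see e.g. [Pe08]. Indeed,

$v^{(-1)} := \frac{\partial}{\partial t}\big|_{t=0} R_{\mathcal{A}}(\theta + tv)$

is the $(-1)$-representation of a tangent vector in the identity chart
$\mathrm{id} : \mathrm{ri}\,\mathcal{S} \to \mathrm{ri}\,\mathcal{S}$. The
$(+1)$-representation of
$u^{(-1)} = \frac{\partial}{\partial t}\big|_{t=0} R_{\mathcal{A}}(\theta + tu)$ equals

$D\log|_{R_{\mathcal{A}}(\theta)}(u^{(-1)}) = \frac{\partial}{\partial t}\big|_{t=0} \log \circ R_{\mathcal{A}}(\theta + tu) = u + \lambda\mathbb{1}$

for some $\lambda \in \mathbb{R}$. Since $v^{(-1)}$ has trace zero, we arrive at the mixed
representation

$\langle\langle u,v\rangle\rangle_{\theta} = \mathrm{tr}(u^{(+1)} v^{(-1)})$

of the BKM-metric, see e.g. [GS01].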

Let us calculate the range of the chart $\pi_U|_{\mathcal{E}}$. The following statement gives
us also an upper bound on the norm closure $\overline{\mathcal{E}}$. It is used implicitly for
linear $\Theta$ in Lemma 7 in [Wi63].

Proposition 4.3. We assume $(x_i)_{i\in\mathbb{N}} \subset \Theta$ and that the states
$\rho_i := R_{\mathcal{A}}(x_i)$, $i \in \mathbb{N}$, converge in norm to
$\rho = \lim_{i\to\infty} \rho_i$. Then $\rho \in \mathcal{E}$ or
$\lim_{i\to\infty} \|x_i\| = \infty$. In the second case, the set $M \subset U$ of accumulation
points of $(x_i/\|x_i\|)_{i\in\mathbb{N}}$ is non-empty and we have
$s(\rho) \preceq p^{+}(u)$ for every $u \in M$.

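Proof: If the sequence $(x_i)_{i\in\mathbb{N}}$ has a bounded subsequence then it has an
accumulation point in $\Theta$. Descending to a subsequence we assume the convergence of
$x := \lim_{i\to\infty} x_i \in \Theta$. By continuity of $R_{\mathcal{A}}$ we have

$\rho = \lim_{i\to\infty} R_{\mathcal{A}}(x_i) = R_{\mathcal{A}}(x) \in \mathcal{E}$.

Otherwise, if $\rho \notin \mathcal{E}$ we have $\lim_{i\to\infty} \|x_i\| = \infty$. We fix
for all following arguments an accumulation point $u$ of
$(x_i/\|x_i\|)_{i\in\mathbb{N}}$ (clearly $u \in U$). Let
$\Phi : \mathcal{A} \to \mathrm{Mat}(N,\mathbb{C})$ be a C*-algebra embedding with
$\Phi(\mathbb{1}) = \mathbb{1}_N$. Weyl's perturbation theorem (25) proves that the largest
eigenvalue of $\Phi(x_i/\|x_i\|)$ converges to the largest eigenvalue of $\Phi(u)$. Since the
largest spectral value of $a \in \mathcal{A}_{sa}$ is the largest eigenvalue of $\Phi(a)$, we
obtain

$\lim_{i\to\infty} \lambda^{+}(x_i/\|x_i\|) = \lambda^{+}(u)$. (45)

Let us assume $u$ is not a real multiple of $\mathbb{1}$ (otherwise $p^{+}(u) = \mathbb{1}$
and the statement $s(\rho) \preceq p^{+}(u)$ is trivial). We put

$y_i := x_i - \lambda^{+}(x_i)\mathbb{1}$

for $i \in \mathbb{N}$. Using $\propto$ as proportionality by a positive real number, we have
for $i \in \mathbb{N}$

$y_i = x_i - \lambda^{+}(x_i)\mathbb{1} \;\propto\; \frac{x_i}{\|x_i\|} - \lambda^{+}\big(\frac{x_i}{\|x_i\|}\big)\mathbb{1}$.

Since $(x_i/\|x_i\|)_{i\in\mathbb{N}}$ converges to $u$ and since $u$ is not proportional to
$\mathbb{1}$ we have $\|y_i\| > 0$ for large $i$. Hence (45) implies

$\lim_{i\to\infty} \frac{y_i}{\|y_i\|} = \frac{u - \lambda^{+}(u)\mathbb{1}}{\|u - \lambda^{+}(u)\mathbb{1}\|}$. (46)

Since $\lambda^{+}(y_i) = 0$, we have $\|e^{y_i}\| \leq 1$ for all $i \in \mathbb{N}$. Hence it
is possible to select a subsequence of $(y_i)_{i\in\mathbb{N}}$, which we call by the same
name, such that

$\tilde{u} := \lim_{i\to\infty} \frac{y_i}{\|y_i\|}$ and $\sigma := \lim_{i\to\infty} \exp_{\mathcal{A}}(y_i)$.

The argument in the first paragraph (applied to
$\tilde{\mathcal{E}} = R_{\mathcal{A}}(\Theta + \mathbb{R}\mathbb{1})$) shows
$\lim_{i\to\infty} \|y_i\| = \infty$ because

$\rho = \lim_{i\to\infty} R_{\mathcal{A}}(x_i) = \lim_{i\to\infty} R_{\mathcal{A}}(y_i)$.

Now Lemma 2.10.1 shows $s(\sigma) \preceq p^{+}(\tilde{u})$. Clearly
$\rho = \sigma/\mathrm{tr}(\sigma)$ holds and we finish by proving
$p^{+}(\tilde{u}) = p^{+}(u)$, which follows from (46). □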



A statement like the following is implicitly used in Theorem 2 b) in [Wi63]. For
completeness we provide a proof.

Lemma 4.4. Let $f : V \to W$ be a continuous map between two finite-dimensional real vector
spaces. Let $K \subset V$ be non-empty and bounded, $L \subset W$ be connected and
$f(K) \subset L$. If $f(K)$ is open and $f(\overline{K} \setminus K) \cap L = \emptyset$, then
$f(K) = L$.

Proof: Since $f(\overline{K} \setminus K) \cap L = \emptyset$ we have
$L \setminus f(K) = L \setminus f(\overline{K}) = (W \setminus f(\overline{K})) \cap L$ and
$f(K) \cap L = f(\overline{K}) \cap L$, hence

$L = (f(K) \cap L) \cup (L \setminus f(K)) = (f(K) \cap L) \cup ((W \setminus f(\overline{K})) \cap L)$. (47)

The set $f(K)$ is open in $W$ by assumption and since $f(\overline{K})$ is compact,
$W \setminus f(\overline{K})$ is open in $W$. Since $f(K) \cap L \neq \emptyset$ by
assumption, (47) is a disconnection of $L$ unless $L \setminus f(K) = \emptyset$. Since $L$ is
connected by assumption, $f(K) \supset L$ follows. □

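We have collected all arguments needed to compute $\pi_U(\mathcal{E})$. The mean value set
$\mathbb{M}(U)$ defined in (4) plays a crucial role.

Theorem 4.5. Let $\mathbb{1} \notin U$. Then $\mathrm{ri}\,\mathbb{M}_{\mathcal{A}}(U)$ is open
in the norm topology of $U$ and the chart change
$\pi_U \circ R_{\mathcal{A}}|_{\Theta} : \Theta \to \mathrm{ri}\,\mathbb{M}_{\mathcal{A}}(U)$
is a real analytic diffeomorphism. We have
$\pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) = \mathrm{rb}\,\mathbb{M}_{\mathcal{A}}(U)$.

Proof: The map $\pi_U \circ R|_{\Theta} : \Theta \to \pi_U(\mathcal{E})$ is a real analytic
diffeomorphism by Proposition 4.1.1 and $\pi_U(\mathcal{E})$ is open relative to $U$. We shall
first show

$\pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) \subset \mathrm{rb}\,\mathbb{M}(U)$. (48)

Let $\rho \in \overline{\mathcal{E}} \setminus \mathcal{E}$. Proposition 4.3 shows that the
support projection of $\rho$ satisfies $s(\rho) \preceq p^{+}(u)$ for a non-zero $u \in U$ and
Proposition 2.14 shows that $\rho$ lies in the exposed face
$\mathbb{F}(p^{+}(u)) = F_{\perp}(\mathcal{S},u)$ of the state space. Then Lemma 2.18 shows
that $\pi_U(\rho)$ lies in the exposed face $F_{\perp}(\mathbb{M}(U),u)$ of the mean value set.
The mean value set $\mathbb{M}(U)$ has non-empty interior because it contains
$\pi_U(\mathcal{E})$ and then Theorem 13.1 in [Ro72] proves that the exposed face
$F_{\perp}(\mathbb{M}(U),u)$ is included in the boundary of $\mathbb{M}(U)$. This proves
$\pi_U(\rho) \in \mathrm{rb}\,\mathbb{M}(U)$.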

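In order to prove that
$\pi_U \circ R|_{\Theta} : \Theta \to \mathrm{ri}\,\mathbb{M}(U)$ is a real analytic
diffeomorphism it suffices to prove $\pi_U(\mathcal{E}) = \mathrm{ri}\,\mathbb{M}(U)$. The
convex body $\mathbb{M}(U)$ is the projection of the whole state space, so
$\pi_U(\mathcal{E}) \subset \mathbb{M}(U)$. But
$\mathcal{E} \subset \mathrm{ri}(\mathcal{S})$ holds by (38) and thanks to the equality
$\pi_U(\mathrm{ri}\,\mathcal{S}) = \mathrm{ri}\,\pi_U(\mathcal{S})$ (see e.g. Theorem 6.6 in
[Ro72]) we have $\pi_U(\mathcal{E}) \subset \mathrm{ri}\,\mathbb{M}(U)$. We meet the
conditions of Lemma 4.4 with

$V = \mathcal{A}_{sa}$, $W = U$, $f = \pi_U$, $K = \mathcal{E}$ and $L = \mathrm{ri}\,\mathbb{M}(U)$.

Indeed, $K = \mathcal{E} \subset \mathcal{S}$ is non-empty and bounded, and the convex set
$L = \mathrm{ri}\,\mathbb{M}(U)$ is connected. We have proved in this paragraph that
$f(K) = \pi_U(\mathcal{E})$ is included in $L = \mathrm{ri}\,\mathbb{M}(U)$, and
$f(K) = \pi_U(\mathcal{E})$ is open relative to $W = U$ as the range of the diffeomorphism
$\pi_U \circ R|_{\Theta}$. Moreover
$\pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) \subset \mathrm{rb}\,\mathbb{M}(U)$ in
(48) implies

$f(\overline{K} \setminus K) \cap L = \pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) \cap \mathrm{ri}\,\mathbb{M}(U) \subset \mathrm{rb}\,\mathbb{M}(U) \cap \mathrm{ri}\,\mathbb{M}(U) = \emptyset$.

Then $\pi_U(\mathcal{E}) = \mathrm{ri}\,\mathbb{M}(U)$ follows.

Figure 4: The Swallow family (left) and Staffelberg family (right) are sketched by
e-geodesics. The cones about the families replace the state space. The mean value
sets are projections of the state space, their boundaries are drawn below.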



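Finally we show
$\pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) = \mathrm{rb}\,\mathbb{M}(U)$. Since
$\overline{\mathcal{E}} \subset \mathcal{S}$ is compact and
$\pi_U(\mathcal{E}) = \mathrm{ri}\,\mathbb{M}(U)$ we have

$\mathbb{M}(U) = \overline{\pi_U(\mathcal{E})} \subset \pi_U(\overline{\mathcal{E}}) \subset \mathbb{M}(U)$

hence $\pi_U(\overline{\mathcal{E}}) = \mathbb{M}(U)$. Then
$\pi_U(\mathcal{E}) = \mathrm{ri}\,\mathbb{M}(U)$ proves
$\pi_U(\overline{\mathcal{E}} \setminus \mathcal{E}) \supset \mathrm{rb}\,\mathbb{M}(U)$. The
opposite inclusion is (48). □

In the sequel $\mathbb{1} \in U$ will naturally occur in our constructions.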

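Corollary 4.6. The map
$\pi_U|_{\mathcal{E}} : \mathcal{E} \to \mathrm{ri}\,\mathbb{M}(U)$ is a bijection and the
inverse $(\pi_U|_{\mathcal{E}})^{-1} : \mathrm{ri}\,\mathbb{M}(U) \to \mathcal{E}$ is real
analytic.

Proof: If $\mathbb{1} \notin U$ then the claim follows from Theorem 4.5 and
Proposition 4.1.3. We assume $\mathbb{1} \in U$ and we define
$\Theta_0 := \pi_{\mathcal{A}_0}(\Theta)$ and $U_0 := \pi_{\mathcal{A}_0}(U)$. We have
$U_0 = \mathrm{lin}(\Theta_0)$ and since $\mathbb{1} \in U$ holds, we have
$U = U_0 + \mathbb{R}\mathbb{1}$ and $\Theta = \Theta_0 + \mathbb{R}\mathbb{1}$. Clearly
$\mathcal{E} = R_{\mathcal{A}}(\Theta) = R_{\mathcal{A}}(\Theta_0)$ and
$(\pi_{U_0}|_{\mathcal{E}})^{-1} : \mathrm{ri}\,\mathbb{M}(U_0) \to \mathcal{E}$ is a real
analytic bijection because $\mathbb{1} \notin U_0$. We have
$\pi_U = \pi_{U_0} + \pi_{\mathbb{R}\mathbb{1}}$ and

$\pi_{\mathbb{R}\mathbb{1}}(\rho) = \frac{\mathbb{1}}{\mathrm{tr}(\mathbb{1})}$ for all $\rho \in \mathcal{S}$.

Hence
$\mathrm{ri}\,\mathbb{M}(U) = \mathrm{ri}\,\mathbb{M}(U_0) + \frac{\mathbb{1}}{\mathrm{tr}(\mathbb{1})}$
holds and for $u \in \mathrm{ri}\,\mathbb{M}(U)$ the equality
$(\pi_U|_{\mathcal{E}})^{-1}(u) = (\pi_{U_0}|_{\mathcal{E}})^{-1}\big(u - \frac{\mathbb{1}}{\mathrm{tr}(\mathbb{1})}\big)$
completes the proof. □

Definition 4.7. We call the continuous bijection
$\pi_U|_{\mathcal{E}} : \mathcal{E} \to \mathrm{ri}\,\mathbb{M}_{\mathcal{A}}(U)$ in
Corollary 4.6 the mean value chart of $\mathcal{E}$. The real analytic inverse
$(\pi_U|_{\mathcal{E}})^{-1} : \mathrm{ri}\,\mathbb{M}_{\mathcal{A}}(U) \to \mathcal{E}$ is the
mean value parametrization of $\mathcal{E}$.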

The mean value parametrization in a non-commutative algebra is demonstrated 
in Figure 4 with the simplest non-trivial examples: 

Example 4.8. The Swallow family, studied in [KW11], is the exponential family

$R(\mathrm{span}_{\mathbb{R}}(\sigma_1 \oplus 1,\, \sigma_2 \oplus 1))$

in the algebra $\mathrm{Mat}(2,\mathbb{C}) \oplus \mathbb{C}$. The Staffelberg family was
defined in Example 1.5 as
$R(\mathrm{span}_{\mathbb{R}}(\sigma_1 \oplus 0,\, \sigma_2 \oplus 1))$. These exponential
families are included in the exponential family $R(V)$ for
$V := \mathrm{span}(\sigma_1 \oplus 0,\, \sigma_2 \oplus 0,\, -\mathbb{1}_2 \oplus 2)$. The
norm closure $\overline{R(V)}$ is the 3D-cone drawn about each family. It is shown in
Example 6.3 in [We10] that $\overline{R(V)}$ consists of all matrices with real coefficients
$a, b, c$

$M = \begin{pmatrix} a & b - ic & 0 \\ b + ic & a & 0 \\ 0 & 0 & 1 - 2a \end{pmatrix}$

such that $M \succeq 0$. If $\mathcal{E}$ denotes one of the above families, let
$U := \log_0(\mathcal{E})$ be the linear subspace of $\mathcal{A}_0$ such that
$\mathcal{E} = R(U)$, see (44). It is known that

$\mathbb{M}(U) = \pi_U(\overline{R(V)}) = \pi_U(\mathcal{S}_{\mathrm{Mat}(2,\mathbb{C})\oplus\mathbb{C}}) = \pi_U(\mathcal{S}_{\mathrm{Mat}(3,\mathbb{C})})$.

In fact, to apply Lemma 3.13 in [We11] we should first use an isometry that permutes
$\sigma_2 \oplus 0$ with $\sigma_3 \oplus 0$ and makes all entries of matrices in $U$ and $V$
real.



4.2 The complete projection theorem 

We extend the projection theorem for an exponential family $\mathcal{E}$, explained in §1.3,
to the entire state space $\mathcal{S}_{\mathcal{A}}$. This will be done by defining an
extension $\mathrm{ext}(\mathcal{E})$ of $\mathcal{E}$ composed of exponential families in
compressed algebras $p\mathcal{A}p$ for orthogonal projections $p$, one for each face of the
mean value set $\mathbb{M}(U)$ (of the vector space $U$). As it turns out, the extension
$\mathrm{ext}(\mathcal{E})$ equals the rl-closure $\mathrm{cl}^{rl}(\mathcal{E})$. We obtain a
bijective mean value parametrization

$(\pi_U|_{\mathrm{cl}^{rl}(\mathcal{E})})^{-1} : \mathbb{M}(U) \to \mathrm{cl}^{rl}(\mathcal{E})$,

whose continuity will be discussed in §4.4. We obtain an rl-projection with linear fibers

$\pi_{\mathcal{E}} : \mathcal{S}_{\mathcal{A}} \to \mathrm{cl}^{rl}(\mathcal{E})$

from the state space $\mathcal{S}_{\mathcal{A}}$ to the rl-closure. We finish by providing
analogues of the natural and canonical parameters known in statistics, both for the
exponential family $\mathcal{E}$ and for its rl-closure.

For orthogonal projections $p \in \mathcal{P}$ the orthogonal projection of
$\mathcal{A}_{sa}$ onto $(p\mathcal{A}p)_{sa}$ was denoted in (34) by

$c^{p} : \mathcal{A}_{sa} \to (p\mathcal{A}p)_{sa}$, $\quad a \mapsto pap$.

Only the projections in the lattice $\mathcal{P}^{U}$ are interesting. They correspond to the
faces of the mean value set $\mathbb{M}(U)$ by (32) and can in principle be computed by
spectral analysis, see Remark 2.26.

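Definition 4.9. For orthogonal projections $p \in \mathcal{P}$ we consider the exponential
family

$\mathcal{E}_p := R_{p\mathcal{A}p}(c^{p}(\Theta))$.

The extension of $\mathcal{E}$ is defined in terms of the projection lattice
$\mathcal{P}^{U}$ as

$\mathrm{ext}(\mathcal{E}) := \bigcup_{p\in\mathcal{P}^{U}\setminus\{0\}} \mathcal{E}_p$.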
We begin by parametrizing the extension $\mathrm{ext}(\mathcal{E})$ from the mean value set
$\mathbb{M}(U)$.

Lemma 4.10. The projection
$\pi_U|_{\mathrm{ext}(\mathcal{E})} : \mathrm{ext}(\mathcal{E}) \to \mathbb{M}(U)$ is a
bijection.

Proof: For every non-zero projection $p \in \mathcal{P}$, the mean value chart in
Corollary 4.6 proves the bijection

$\pi_{c^{p}(U)}|_{\mathcal{E}_p} : \mathcal{E}_p \to \mathrm{ri}\,\mathbb{M}_{p\mathcal{A}p}(c^{p}(U))$.

Using the third diagram in Lemma 2.23 we have the bijection

$\pi_U|_{\mathcal{E}_p} : \mathcal{E}_p \to \mathrm{ri}\,\pi_U(\mathbb{F}_{\mathcal{A}}(p))$. (49)

The map $\mathcal{P}^{U} \to \mathcal{F}(\mathbb{M}(U))$,
$p \mapsto \pi_U(\mathbb{F}_{\mathcal{A}}(p))$ is a lattice isomorphism from the projection
lattice $\mathcal{P}^{U}$ to the face lattice of the mean value set, see (32). Then the
stratification (30) of $\mathbb{M}(U)$ into the relative interiors of its faces shows that the
bijections (49) assemble to a bijection $\mathrm{ext}(\mathcal{E}) \to \mathbb{M}(U)$. □

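We will denote the following extension of the projection $\pi_{\mathcal{E}}$ from (10) by the
same symbol $\pi_{\mathcal{E}}$ and we call it rl-projection according to the result of
Theorem 4.15.

Definition 4.11. The rl-projection to $\mathcal{E}$ is well-defined by Lemma 4.10 as the map

$\pi_{\mathcal{E}} : \mathcal{S}_{\mathcal{A}} \to \mathrm{ext}(\mathcal{E})$, $\quad \rho \mapsto (\pi_U|_{\mathrm{ext}(\mathcal{E})})^{-1} \circ \pi_U(\rho)$.

For technical reasons we denote the relative entropy with the first argument
$\rho \in \mathcal{S}$ fixed by $S_{\rho} : \mathcal{S} \to [0,\infty]$,
$S_{\rho}(\sigma) := S(\rho,\sigma)$.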

For convenience we cite Lemma 7 and Lemma 10 in [KW11]. The first proposition
follows immediately from Lemma 2.10.2, the second requires some computation.

Lemma 4.12. Suppose $\theta,u \in \mathcal{A}_{sa}$ and $p := p^{+}(u)$ is the maximal
projection of $u$. We have

$\lim_{t\to\infty} R_{\mathcal{A}}(\theta + tu) = R_{p\mathcal{A}p}(c^{p}(\theta))$ (50)

and

$\lim_{t\to\infty} \big(F_{\mathcal{A}}(\theta + tu) - t\lambda^{+}(u)\big) = F_{p\mathcal{A}p}(c^{p}(\theta))$. (51)
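
The limits (50) and (51) can be observed numerically. The following is a minimal sketch under assumptions that are not in the paper: $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$, $F(\theta) = \log\mathrm{tr}\exp(\theta)$, a direction $u$ with maximal projection $p = \mathrm{diag}(1,1,0)$ and $\lambda^{+}(u) = 1$, and the compressed algebra $p\mathcal{A}p$ identified with the upper-left $2 \times 2$ block. Function and variable names are hypothetical.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta), evaluated stably from the eigenvalues."""
    w = np.linalg.eigvalsh(theta)
    return w.max() + np.log(np.sum(np.exp(w - w.max())))

def gibbs(theta):
    """R(theta) = exp(theta) / tr exp(theta)."""
    w, v = np.linalg.eigh(theta)
    e = (v * np.exp(w - w.max())) @ v.conj().T
    return e / np.trace(e).real

theta = np.array([[0.3, 0.5, 0.2],
                  [0.5, -0.1, 0.4],
                  [0.2, 0.4, 0.7]])
u = np.diag([1.0, 1.0, 0.0])          # lambda^+(u) = 1, maximal projection p = diag(1, 1, 0)

# Right-hand sides of (50) and (51): Gibbs state and free energy of the
# compression p theta p, computed inside the 2x2 block that represents pAp.
limit_state = np.zeros((3, 3))
limit_state[:2, :2] = gibbs(theta[:2, :2])
limit_free_energy = free_energy(theta[:2, :2])

ts = (1e1, 1e3, 1e6)
state_err = [np.linalg.norm(gibbs(theta + t * u) - limit_state) for t in ts]
energy_err = [abs(free_energy(theta + t * u) - t * 1.0 - limit_free_energy) for t in ts]
print(state_err)
print(energy_err)
```

Because $\theta$ does not commute with $u$ in this example, the limit state is the Gibbs state of the compressed matrix $p\theta p$ and not the compression of $R(\theta)$; the errors shrink as $t$ grows.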

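Lemma 4.13. Suppose $\theta,u \in \mathcal{A}_{sa}$ and $u$ is not proportional to the
multiplicative identity $\mathbb{1}$ in $\mathcal{A}$. If the state $\rho$ belongs to the
exposed face $F_{\perp}(\mathcal{S}_{\mathcal{A}},u)$ of the state space, then
$S_{\rho}(R_{\mathcal{A}}(\theta + tu))$ is strictly monotone decreasing with
$t \in \mathbb{R}$ and

$\inf_{t\in\mathbb{R}} S_{\rho}(R_{\mathcal{A}}(\theta + tu)) = \lim_{t\to\infty} S_{\rho}(R_{\mathcal{A}}(\theta + tu)) = S_{\rho}\big(\lim_{t\to\infty} R_{\mathcal{A}}(\theta + tu)\big)$.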
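
The monotonicity in Lemma 4.13 can be watched along one e-geodesic. The sketch below is illustrative only and rests on assumptions: $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$, $F(\theta) = \log\mathrm{tr}\exp(\theta)$, the identity $\log R(\theta) = \theta - F(\theta)\mathbb{1}$ to evaluate the relative entropy stably, and a state $\rho$ supported on the maximal eigenspace of $u$, which is how the exposed face $F_{\perp}(\mathcal{S},u)$ is read here. The names `s_rho`, `values` and `limit_value` are hypothetical.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta), evaluated stably from the eigenvalues."""
    w = np.linalg.eigvalsh(theta)
    return w.max() + np.log(np.sum(np.exp(w - w.max())))

def s_rho(rho, theta):
    """S(rho, R(theta)) = tr(rho log rho) - tr(rho theta) + F(theta), with 0 log 0 = 0."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(np.sum(w * np.log(w)) - np.trace(rho @ theta).real + free_energy(theta))

theta = np.array([[0.3, 0.5, 0.2],
                  [0.5, -0.1, 0.4],
                  [0.2, 0.4, 0.7]])
u = np.diag([1.0, 1.0, 0.0])                    # maximal projection p = diag(1, 1, 0)
rho = np.array([[0.6, 0.2, 0.0],
                [0.2, 0.4, 0.0],
                [0.0, 0.0, 0.0]])               # state supported on p

ts = [-4.0, -2.0, 0.0, 2.0, 4.0, 8.0, 1e4]
values = [s_rho(rho, theta + t * u) for t in ts]

# Value at the limit state R_{pAp}(p theta p), computed inside the 2x2 block.
limit_value = s_rho(rho[:2, :2], theta[:2, :2])
print(values, limit_value)
```

The printed values decrease with $t$ and approach the value at the limit state from above, in line with the infimum being attained only in the limit.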

The next proposition is an intriguing interplay between convex geometry and
matrix calculus, making a statement about the relative entropy
$S_{\rho}(\sigma) = S(\rho,\sigma)$ along a single curve.

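Proposition 4.14. Let $\rho \in \mathcal{S}_{\mathcal{A}}$ and
$\sigma \in \mathrm{ext}(\mathcal{E})$, such that $\sigma \neq \pi_{\mathcal{E}}(\rho)$ and
$S_{\rho}(\sigma) < \infty$. There exists a continuous path
$\gamma : [0,1] \to \mathrm{ext}(\mathcal{E})$ from $\gamma(0) = \sigma$ to
$\gamma(1) = \pi_{\mathcal{E}}(\rho)$, such that

1. $S_{\rho}(\gamma(t))$ is strictly monotone decreasing in $t$ and

2. $d_{\mathcal{E}}(\rho) \leq S_{\rho}(\gamma(1))$.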

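Proof: We construct $\gamma$ by concatenation of several e-geodesics. Since
$S_{\rho}(\sigma) < \infty$ we have $s(\rho) \preceq s(\sigma)$. By Lemma 2.27 there exists a
projection $p \in \mathcal{P}^{U}$, such that
$\rho \in \mathrm{ri}\,\mathbb{F}_{\mathcal{A}}(p) + U^{\perp}$ and then it follows from
Lemma 4.10 that $\pi_{\mathcal{E}}(\rho) \in \mathcal{E}_p$. We denote by
$q \in \mathcal{P}^{U}$ the projection such that $\sigma \in \mathcal{E}_q$, i.e.
$s(\sigma) = q$. By Lemma 2.27 we have

$p = \bigwedge\{r \in \mathcal{P}^{U} \mid s(\rho) \preceq r\} \preceq s(\sigma) = q$.

By Corollary 2.25 there exists an access sequence of projections for $U$ including both
$p$ and $q$, say

$\mathbb{1} = p_0 \succ p_1 \succ \cdots \succ p_m = p$,

where $m \geq 0$, $p = p_m$ and $q = p_l$ for some $l \leq m$.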

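We define $(m - l)$ e-geodesic rays in
$\mathcal{E}_{p_l}, \mathcal{E}_{p_{l+1}}, \ldots, \mathcal{E}_{p_{m-1}}$. From the
Definition 2.21.3 of an access sequence and by Corollary 2.20, for each $k = 0,\ldots,m-1$
there exists $u_k \in c^{p_k}(U)$, such that

$p_{k+1} = p^{+}_{p_k\mathcal{A}p_k}(u_k)$. (52)

Moreover, $u_k$ is not a multiple of the identity $p_k$ in $p_k\mathcal{A}p_k$ (because
$p_{k+1} \neq p_k$). Let $\theta \in \Theta$ such that
$\sigma = R_{q\mathcal{A}q}(c^{q}(\theta))$. We define for $k = 0,\ldots,m-1$ the e-geodesic

$g_k : \mathbb{R} \to \mathcal{E}_{p_k}$, $\quad t \mapsto R_{p_k\mathcal{A}p_k}(c^{p_k}(\theta) + tu_k)$. (53)

By (50) we can define

$\sigma_{k+1} := \lim_{t\to\infty} g_k(t) = R_{p_{k+1}\mathcal{A}p_{k+1}}(c^{p_{k+1}}(\theta)) \in \mathcal{E}_{p_{k+1}}$. (54)

After the reparametrization $t = \frac{s}{1-s}$, each e-geodesic ray
$g_l|_{[0,\infty)}, g_{l+1}|_{[0,\infty)}, \ldots, g_{m-1}|_{[0,\infty)}$ is defined on the
segment $[0,1)$.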

We concatenate the reparametrized e-geodesic rays to a continuous curve
$\tilde{\gamma} : [0,m-l] \to \mathrm{ext}(\mathcal{E})$. The pieces fit together by (54). If
$\sigma_m \neq \pi_{\mathcal{E}}(\rho)$ then we add the e-geodesic segment in $\mathcal{E}_p$
from $\sigma_m$ to $\pi_{\mathcal{E}}(\rho)$. This is parametrized under $R_{p\mathcal{A}p}$ by
a straight line segment in $c^{p}(\Theta)$, which we parametrize linearly by the unit interval
$[0,1]$. Since $\sigma \neq \pi_{\mathcal{E}}(\rho)$, one of the inequalities $m - l > 0$ or
$\sigma_m \neq \pi_{\mathcal{E}}(\rho)$ must be true, so we obtain a curve
$\gamma : [0,1] \to \mathrm{ext}(\mathcal{E})$ from $\tilde{\gamma}$ by a parametrization
speedup by a factor of $m - l$ or $m - l + 1$.

We argue that $S_{\rho}$ is strictly monotone decreasing along $\gamma$. This can be done for
the rays in the natural parametrization (53) for $k = 0,\ldots,m-1$. Since
$p \preceq p_{k+1} \preceq p_k$ we have by Proposition 2.14 and (52)

$\rho \in \mathbb{F}_{\mathcal{A}}(p) \subset \mathbb{F}_{\mathcal{A}}(p_{k+1}) = \mathbb{F}_{p_k\mathcal{A}p_k}(p_{k+1}) = F_{\perp}(\mathcal{S}_{p_k\mathcal{A}p_k},u_k)$. (55)

Since $u_k$ is not a multiple of the identity $p_k$, Lemma 4.13 can be invoked and it
shows that $S_{\rho}$ is strictly monotone decreasing along $g_k$.

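The fact that $S_{\rho}$ is strictly monotone decreasing along the e-geodesic segment from
$\sigma_m$ to $\pi_{\mathcal{E}}(\rho)$ uses the strict convexity of $S_{\rho}$ on
$\mathcal{E}_p$ in the parametrization of $R_{p\mathcal{A}p}$. In order to have an injective
parametrization we project $c^{p}(\Theta)$ onto
$(p\mathcal{A}p)_0 = \{a \in (p\mathcal{A}p)_{sa} \mid \mathrm{tr}(a) = 0\}$ so that
$\Theta_0 := \pi_{(p\mathcal{A}p)_0}(c^{p}(\Theta))$ satisfies
$\mathcal{E}_p = R_{p\mathcal{A}p}(c^{p}(\Theta)) = R_{p\mathcal{A}p}(\Theta_0)$. For all
$\theta_0 \in \Theta_0$ an elementary calculation shows

$S_{\rho}(R_{p\mathcal{A}p}(\theta_0)) = -S(\rho) - \mathrm{tr}(\rho\,\theta_0) + F_{p\mathcal{A}p}(\theta_0)$

where $S(\rho)$ is the von Neumann entropy and $F$ is the free energy from Definition 1.12.
Hence for
$u_0,v_0 \in \pi_{(p\mathcal{A}p)_0}(c^{p}(U)) = \mathrm{lin}(\Theta_0) = \Theta_0 - \Theta_0$
the second derivative

$\frac{\partial^2}{\partial s\,\partial t}\big|_{s=t=0} S_{\rho}(R_{p\mathcal{A}p}(\theta_0 + su_0 + tv_0)) = \frac{\partial^2}{\partial s\,\partial t}\big|_{s=t=0} F_{p\mathcal{A}p}(\theta_0 + su_0 + tv_0) = \langle\langle u_0,v_0\rangle\rangle_{\theta_0}$

equals the BKM-metric (41). As discussed in the paragraph following (42) the free energy has
a positive definite Hessian on $\Theta_0$ so $S_{\rho} \circ R_{p\mathcal{A}p}$ is strictly
convex on $\Theta_0$. Since $\pi_{\mathcal{E}}(\rho) \in \mathcal{E}_p$, the function
$S_{\rho}$ has on $\mathcal{E}_p \cong \Theta_0$ a global minimum at
$\pi_{\mathcal{E}}(\rho)$; this follows from the projection theorem (10). Hence $S_{\rho}$ is
strictly monotone decreasing along the e-geodesic from $\sigma_m \in \mathcal{E}_p$ to
$\pi_{\mathcal{E}}(\rho) \in \mathcal{E}_p$.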
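
The "elementary calculation" and the resulting convexity can be checked numerically in the simplest case. The sketch below assumes $p = \mathbb{1}$ and $\mathcal{A} = \mathrm{Mat}(3,\mathbb{C})$, so that no compression is needed, an invertible state $\rho$, and $F(\theta) = \log\mathrm{tr}\exp(\theta)$; these choices, the function names and the test matrices are assumptions of the illustration. It compares the relative entropy computed from matrix logarithms with the formula $-S(\rho) - \mathrm{tr}(\rho\,\theta_0) + F(\theta_0)$ and evaluates a second difference along a line in parameter space.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta), evaluated stably from the eigenvalues."""
    w = np.linalg.eigvalsh(theta)
    return w.max() + np.log(np.sum(np.exp(w - w.max())))

def gibbs(theta):
    """R(theta) = exp(theta) / tr exp(theta)."""
    w, v = np.linalg.eigh(theta)
    e = (v * np.exp(w - w.max())) @ v.conj().T
    return e / np.trace(e).real

def log_h(a):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(rho, sigma):
    """S(rho, sigma) = tr rho (log rho - log sigma) for invertible rho and sigma."""
    return np.trace(rho @ (log_h(rho) - log_h(sigma))).real

def von_neumann(rho):
    """S(rho) = -tr rho log rho for invertible rho."""
    return -np.trace(rho @ log_h(rho)).real

rho = np.array([[0.5, 0.1, 0.0], [0.1, 0.3, 0.1], [0.0, 0.1, 0.2]])            # invertible state
theta0 = np.array([[0.4, 0.5, 0.0], [0.5, -0.6, 0.1j], [0.0, -0.1j, 0.2]])     # traceless
u0 = np.diag([1.0, -1.0, 0.0])                                                   # traceless direction

lhs = rel_entropy(rho, gibbs(theta0))
rhs = -von_neumann(rho) - np.trace(rho @ theta0).real + free_energy(theta0)

def along_line(t):
    return rel_entropy(rho, gibbs(theta0 + t * u0))

second_difference = along_line(0.1) - 2.0 * along_line(0.0) + along_line(-0.1)
print(abs(lhs - rhs), second_difference)
```

The first printed number should vanish up to rounding; the second is positive, as expected for a strictly convex function of the parameter.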

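Second, we prove $d_{\mathcal{E}}(\rho) \leq S_{\rho}(\pi_{\mathcal{E}}(\rho))$ by showing for
$k = 0,\ldots,m-1$ that $d_{\mathcal{E}_{p_k}}(\rho) \leq d_{\mathcal{E}_{p_{k+1}}}(\rho)$
holds. Then

$d_{\mathcal{E}}(\rho) = d_{\mathcal{E}_{p_0}}(\rho) \leq d_{\mathcal{E}_{p_1}}(\rho) \leq \cdots \leq d_{\mathcal{E}_{p_m}}(\rho) = d_{\mathcal{E}_p}(\rho) \leq S_{\rho}(\pi_{\mathcal{E}}(\rho))$

will follow, the last inequality since $\pi_{\mathcal{E}}(\rho) \in \mathcal{E}_p$. Let
$\tau \in \mathcal{E}_{p_{k+1}}$ and let $\theta \in \Theta$ such that
$\tau = R_{p_{k+1}\mathcal{A}p_{k+1}}(c^{p_{k+1}}(\theta))$. Then by (52), (55) and
Lemma 4.13 we have

$d_{\mathcal{E}_{p_k}}(\rho) \leq \inf_{t\in\mathbb{R}} S_{\rho}(R_{p_k\mathcal{A}p_k}(c^{p_k}(\theta) + tu_k)) = S_{\rho}(R_{p_{k+1}\mathcal{A}p_{k+1}}(c^{p_{k+1}}(\theta))) = S_{\rho}(\tau)$.

Taking the infimum over all $\tau \in \mathcal{E}_{p_{k+1}}$ the claim follows. □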

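Local minimizers in the following theorem are understood in the norm topology:
If $(X,\mathcal{T})$ is a topological space, $Y \subset X$ and $f : Y \to \mathbb{R}$, then
$x_0 \in Y$ is a local minimizer (maximizer) of $f$ on $Y$, if there is a $\mathcal{T}$
neighborhood $V$ of $x_0$ in $X$, such that for all $x \in V \cap Y$ we have
$f(x_0) \leq f(x)$ ($f(x_0) \geq f(x)$). We now consider the rl-closure
$\mathrm{cl}^{rl}(\mathcal{E}) = \{\rho \in \mathcal{S} \mid \inf_{\sigma\in\mathcal{E}} S(\rho,\sigma) = 0\}$.

Theorem 4.15 (Complete projection theorem). We have
$\mathrm{cl}^{rl}(\mathcal{E}) = \mathrm{ext}(\mathcal{E})$ and
$\pi_U|_{\mathrm{cl}^{rl}(\mathcal{E})} : \mathrm{cl}^{rl}(\mathcal{E}) \to \mathbb{M}_{\mathcal{A}}(U)$
is a bijection. For each $\rho \in \mathcal{S}$ the relative entropy $S_{\rho}$ has a unique
local minimizer on the rl-closure $\mathrm{cl}^{rl}(\mathcal{E})$ at the rl-projection
$\pi_{\mathcal{E}}(\rho)$. The entropy distance is
$d_{\mathcal{E}}(\rho) = S_{\rho}(\pi_{\mathcal{E}}(\rho)) = \min_{\sigma\in\mathrm{cl}^{rl}(\mathcal{E})} S_{\rho}(\sigma)$.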

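Proof: For each $\rho \in \mathcal{S}$ we observe from Proposition 4.14.1 and from the fact
that $S_{\rho}$ is finite on $\mathcal{E}$, that $\pi_{\mathcal{E}}(\rho)$ is the unique
global minimizer of $S_{\rho}$ on $\mathrm{ext}(\mathcal{E})$. By Proposition 4.14.2 all
$\sigma \in \mathrm{ext}(\mathcal{E})$ satisfy
$d_{\mathcal{E}}(\rho) \leq S_{\rho}(\pi_{\mathcal{E}}(\rho)) \leq S_{\rho}(\sigma)$. Taking
the infimum over $\sigma \in \mathrm{ext}(\mathcal{E})$ this shows
$d_{\mathcal{E}}(\rho) \leq d_{\mathrm{ext}(\mathcal{E})}(\rho)$ and since the converse
inequality is trivial, we have proved

$d_{\mathcal{E}}(\rho) = d_{\mathrm{ext}(\mathcal{E})}(\rho)$.

Now $\mathrm{cl}^{rl}(\mathcal{E}) = \mathrm{cl}^{rl}(\mathrm{ext}(\mathcal{E}))$ follows
immediately, see (12) and Definition 1.7. We show
$\mathrm{cl}^{rl}(\mathrm{ext}(\mathcal{E})) = \mathrm{ext}(\mathcal{E})$. The inclusion
"$\supset$" is trivial. Conversely, let
$\rho \in \mathrm{cl}^{rl}(\mathrm{ext}(\mathcal{E}))$. Then
$d_{\mathrm{ext}(\mathcal{E})}(\rho) = 0$. Since the relative entropy is non-negative and since
$\pi_{\mathcal{E}}(\rho)$ is a global minimizer of $S_{\rho}$ on $\mathrm{ext}(\mathcal{E})$
we have $S_{\rho}(\pi_{\mathcal{E}}(\rho)) = 0$, i.e. $\rho = \pi_{\mathcal{E}}(\rho)$. In
particular $\rho \in \mathrm{ext}(\mathcal{E})$. We conclude
$\mathrm{ext}(\mathcal{E}) = \mathrm{cl}^{rl}(\mathcal{E})$. Now Lemma 4.10 shows that
$\pi_U|_{\mathrm{cl}^{rl}(\mathcal{E})} : \mathrm{cl}^{rl}(\mathcal{E}) \to \mathbb{M}_{\mathcal{A}}(U)$
is a bijection.

It remains to discuss local minimizers $\sigma \in \mathrm{cl}^{rl}(\mathcal{E})$ of
$S_{\rho}$ on $\mathrm{cl}^{rl}(\mathcal{E})$. If $S_{\rho}(\sigma) < \infty$, then
Proposition 4.14.1 shows that $\sigma$ is not a local minimizer unless
$\sigma = \pi_{\mathcal{E}}(\rho)$. If $S_{\rho}(\sigma) = \infty$ we observe that
$\mathcal{E}$ is norm dense in $\mathrm{cl}^{rl}(\mathcal{E})$ by the Pinsker-Csiszár
inequality (13). Since $S_{\rho}$ has finite values on $\mathcal{E}$, the state $\sigma$ is not
a local minimizer. □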

The following coordinates $(\lambda_1, \ldots, \lambda_k) \in \mathbb{R}^k$ depend on the knowledge of the
projection lattice $\mathcal{P}^U$, which in principle can be computed by the method of
Remark 2.26. The derivative of the free energy $F$ allows us to compute them.

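Corollary 4.16. Let $\theta_0, u_1, \ldots, u_k \in \mathcal{A}_{sa}$ and let $\Theta := \theta_0 + \mathrm{span}_{\mathbb{R}}(u_1, \ldots, u_k)$ (then
$U = \mathrm{span}_{\mathbb{R}}(u_1, \ldots, u_k)$ and $\mathcal{E} = R_{\mathcal{A}}(\Theta)$). The mean value map defines a bijection
from the rI-closure of $\mathcal{E}$ to the convex support

$$m_{u_1,\ldots,u_k}|_{\mathrm{cl}^{rI}(\mathcal{E})} : \mathrm{cl}^{rI}(\mathcal{E}) \to \mathrm{cs}(u_1, \ldots, u_k).$$

For every $\rho \in \mathrm{cl}^{rI}(\mathcal{E})$ there exist a unique projection $p \in \mathcal{P}^U$ and some (in general
non-unique) $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$, such that

$$\Big(\frac{\partial}{\partial \lambda_1}, \ldots, \frac{\partial}{\partial \lambda_k}\Big) F_{p\mathcal{A}p}\Big(c^p\Big(\theta_0 + \sum_{i=1}^k \lambda_i u_i\Big)\Big) = m_{u_1,\ldots,u_k}(\rho). \qquad (56)$$

For each solution $[p, (\lambda_1, \ldots, \lambda_k)]$ to (56) we have $R_{p\mathcal{A}p}(c^p(\theta_0 + \sum_{i=1}^k \lambda_i u_i)) = \rho$.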

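Proof: Theorem 4.15 shows $\mathrm{cl}^{rI}(\mathcal{E}) = \mathrm{ext}(\mathcal{E})$. If $\rho \in \mathrm{cl}^{rI}(\mathcal{E})$ then by definition of
$\mathrm{ext}(\mathcal{E})$ there exist $p \in \mathcal{P}^U$ and $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$ such that for $\theta := \theta_0 + \sum_{i=1}^k \lambda_i u_i$ we
have $\rho = R_{p\mathcal{A}p}(c^p(\theta))$. Since $s(\rho) = p$, the projection $p$ is unique. The derivative
(39) of the free energy is for $j = 1, \ldots, k$ given by

$$\frac{\partial}{\partial \lambda_j} F_{p\mathcal{A}p}(c^p(\theta)) = \langle c^p(u_j), R_{p\mathcal{A}p}(c^p(\theta)) \rangle = \langle u_j, R_{p\mathcal{A}p}(c^p(\theta)) \rangle,$$

hence the existence part follows from $\rho = R_{p\mathcal{A}p}(c^p(\theta))$ and the equation

$$\Big(\frac{\partial}{\partial \lambda_1}, \ldots, \frac{\partial}{\partial \lambda_k}\Big) F_{p\mathcal{A}p}(c^p(\theta)) = m_{u_1,\ldots,u_k}\big(R_{p\mathcal{A}p}(c^p(\theta))\big). \qquad (57)$$

On the other hand, if $[p, (\lambda_1, \ldots, \lambda_k)]$ solves (56), then (57) implies

$$m_{u_1,\ldots,u_k}\big(R_{p\mathcal{A}p}(c^p(\theta))\big) = m_{u_1,\ldots,u_k}(\rho).$$

This implies $R_{p\mathcal{A}p}(c^p(\theta)) = \rho$ since the mean value map restricts to the bijection

$$m_{u_1,\ldots,u_k}|_{\mathrm{cl}^{rI}(\mathcal{E})} : \mathrm{cl}^{rI}(\mathcal{E}) \to \mathrm{cs}(u_1, \ldots, u_k).$$

Indeed, $\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})} : \mathrm{cl}^{rI}(\mathcal{E}) \to \mathbb{M}(U)$ is a bijection by Theorem 4.15. The claim follows
because $m_{u_1,\ldots,u_k} = m_{u_1,\ldots,u_k} \circ \pi_U$ holds and $m_{u_1,\ldots,u_k}|_{\mathbb{M}(U)} : \mathbb{M}(U) \to \mathrm{cs}(u_1, \ldots, u_k)$ is
a bijection by Remark 1.2.5. □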

The parameters $m_{u_1,\ldots,u_k}(\rho) \in \mathbb{R}^k$ of $\rho \in \mathrm{cl}^{rI}(\mathcal{E})$ are analogues of natural
parameters, the parameters $(\lambda_1, \ldots, \lambda_k) \in \mathbb{R}^k$ are analogues of canonical parameters in
statistics, see §20 in [Ce82].
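
The relation between the free energy and the mean values in (56) and (57) can be checked numerically in the simplest case $p = 1$. The following is a minimal sketch, assuming the conventions $F(\theta) = \log \operatorname{tr} e^{\theta}$ and $R(\theta) = e^{\theta} / \operatorname{tr} e^{\theta}$, which are consistent with the identities used in this section; the qubit data (`theta0`, `us`, `lam`) and all function names are hypothetical choices for illustration, and the gradient is approximated by central finite differences rather than computed in closed form.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta) for a Hermitian matrix theta (assumed convention)."""
    w = np.linalg.eigvalsh(theta)
    m = w.max()
    return float(m + np.log(np.sum(np.exp(w - m))))

def gibbs_state(theta):
    """R(theta) = exp(theta) / tr exp(theta), via an eigendecomposition."""
    w, v = np.linalg.eigh(theta)
    e = np.exp(w - w.max())
    return (v * (e / e.sum())) @ v.conj().T

def mean_values(rho, us):
    """Mean value map: the vector of tr(u_i rho)."""
    return np.array([np.trace(u @ rho).real for u in us])

def grad_free_energy(theta0, us, lam, h=1e-6):
    """Central finite differences of lam -> F(theta0 + sum_i lam_i u_i)."""
    lam = np.asarray(lam, dtype=float)

    def f(l):
        return free_energy(theta0 + sum(li * u for li, u in zip(l, us)))

    grad = np.zeros(len(us))
    for j in range(len(us)):
        e = np.zeros(len(us))
        e[j] = h
        grad[j] = (f(lam + e) - f(lam - e)) / (2 * h)
    return grad

# Hypothetical example data: Pauli matrices on a qubit.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
theta0, us, lam = 0.3 * Y, [X, Z], [0.4, -0.7]

theta = theta0 + lam[0] * X + lam[1] * Z
lhs = grad_free_energy(theta0, us, lam)      # left-hand side of (56) with p = 1
rhs = mean_values(gibbs_state(theta), us)    # mean values of R(theta)
```

In this sketch `lhs` and `rhs` agree up to the finite-difference error. The non-invertible case $p \neq 1$ would require the compressed algebra $p\mathcal{A}p$ and is not covered here.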
4.3 Local maximizers of the entropy distance

Maximization of the entropy distance from an exponential family is proposed in
[Ay02] as a structuring principle in natural systems. We prove two necessary
conditions for a local maximizer of the entropy distance $d_{\mathcal{E}}$ from an exponential family
$\mathcal{E}$. One condition, an upper bound on the rank, enforces determinism on the local
maximizer. The second condition identifies local maximizers as the cutoff of their
rI-projection.

The idea of Theorem 4.17 is from Proposition 3.2 in [Ay02] where the assertion
is proved for a subset of the probability simplex (7) on a finite set $\Omega$. The statement
was proved in Corollary 2 in [MA03] on the whole probability simplex on $\Omega$.

We shall use the fact that a unique face of $\mathcal{S}_{\mathcal{A}}$ exists, which contains a given
state $\rho \in \mathcal{S}_{\mathcal{A}}$ in its relative interior (30).

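Theorem 4.17. Let $\rho \in \mathcal{S}_{\mathcal{A}}$ be a local maximizer of the entropy distance $d_{\mathcal{E}}$ from $\mathcal{E}$
and assume that $F$ is the face of the state space $\mathcal{S}_{\mathcal{A}}$ which contains $\rho$ in its relative
interior. Then $\dim(F) \le \dim(\mathcal{E})$.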

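Proof: We consider the convex set $K := F \cap (\rho + U^\perp)$. If two convex sets
$X, Y \subset \mathbb{E}$ in the finite-dimensional Euclidean vector space $(\mathbb{E}, \langle \cdot, \cdot \rangle)$ share a relative
interior point, then $\mathrm{ri}(X \cap Y) = \mathrm{ri}(X) \cap \mathrm{ri}(Y)$ follows, see e.g. Theorem 6.5 in [Ro72].
Hence $\rho \in \mathrm{ri}(K)$ follows.

If $\sigma \in K$, then by Definition 4.11 of the rI-projection $\pi_{\mathcal{E}}$ we have $\pi_{\mathcal{E}}(\sigma) = \pi_{\mathcal{E}}(\rho)$
and Theorem 4.15 allows us to rewrite the entropy distance

$$d_{\mathcal{E}}(\sigma) = S(\sigma, \pi_{\mathcal{E}}(\sigma)) = S(\sigma, \pi_{\mathcal{E}}(\rho)).$$

We have $p := s(\pi_{\mathcal{E}}(\rho)) \succeq s(\sigma)$ by Lemma 2.27 and with notation of functional
calculus from Definition 2.3.3 we get

$$d_{\mathcal{E}}(\sigma) = -S(\sigma) - \operatorname{tr} \sigma \log^{[p]}(\pi_{\mathcal{E}}(\rho)).$$

The von Neumann entropy $S(\sigma)$ is strictly concave, see e.g. §II.B in [We78]. Hence
$d_{\mathcal{E}}(\sigma)$ as a sum of a strictly convex function and a linear function is strictly convex
on $K$. Since $\rho$ is a local maximizer of $d_{\mathcal{E}}$ on $\mathcal{S}_{\mathcal{A}}$ it is a local maximizer of the strictly
convex function $d_{\mathcal{E}}$ on $K$. Since in addition $\rho \in \mathrm{ri}(K)$ holds, we get $K = \{\rho\}$. Then

$$\dim F + \dim U^\perp \le \dim \mathcal{A}_{sa} = \dim U + \dim U^\perp$$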

follows, hence dimF < dim?7. If we choose a parametrization of such that \ 
(e.g. by replacing by 7r^o(0)) then Proposition 4.1.1 and 4.1.3 show dimt/ = dim£^ 
completing the proof. □ 

The bound in Theorem 4.17 enjoys a quadratic improvement in $n \in \mathbb{N}$ from the
commutative algebra $\mathbb{C}^n$ to the non-commutative algebra $\mathrm{Mat}(n, \mathbb{C})$.

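Remark 4.18. In a C*-subalgebra $\mathcal{A}$ of $\mathrm{Mat}(n, \mathbb{C})$ let $\rho \in \mathcal{S}_{\mathcal{A}}$ and let $F$ be the face
of $\mathcal{S}_{\mathcal{A}}$ containing $\rho$ in its relative interior. Let $p := s(\rho)$ be the support projection
of $\rho$ and let $\mathrm{rk}(\rho)$ be the rank of $\rho$. Then Proposition 2.14 shows $F = \mathcal{S}_{p\mathcal{A}p}$ and
$\dim(\mathcal{S}_{p\mathcal{A}p}) = \dim((p\mathcal{A}p)_{sa}) - 1$, hence

$$\dim(F) = \dim((p\mathcal{A}p)_{sa}) - 1 = \dim_{\mathbb{C}}(p\mathcal{A}p) - 1.$$

If $\rho$ is a local maximizer of the entropy distance $d_{\mathcal{E}}$ from an exponential family $\mathcal{E}$
then Theorem 4.17 shows

$$\dim_{\mathbb{C}}(p\mathcal{A}p) = \dim(F) + 1 \le \dim(\mathcal{E}) + 1. \qquad (58)$$

If $\mathcal{A} \cong \mathbb{C}^n$ is the algebra of diagonal matrices in $\mathrm{Mat}(n, \mathbb{C})$ we have $\dim_{\mathbb{C}}(p\mathcal{A}p) =
\mathrm{rk}(\rho)$, hence (58) shows

$$\mathrm{rk}(\rho) \le \dim(\mathcal{E}) + 1.$$

If $\mathcal{A} = \mathrm{Mat}(n, \mathbb{C})$ is the full matrix algebra of size $n$, then $p\mathcal{A}p$ is unitarily equivalent
to the algebra of block matrices $\mathrm{Mat}(\mathrm{rk}(\rho), \mathbb{C}) \oplus 0_{n-\mathrm{rk}(\rho)}$ and $p\mathcal{A}p$ has dimension
$\dim_{\mathbb{C}}(p\mathcal{A}p) = \mathrm{rk}(\rho)^2$. Then (58) proves

$$\mathrm{rk}(\rho) \le \sqrt{\dim(\mathcal{E}) + 1}.$$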
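
To make the quadratic improvement concrete, the two rank bounds derived from (58) can be tabulated. This is only an illustrative sketch: the function names are hypothetical, and the trivial additional bound $\mathrm{rk}(\rho) \le n$ is not included.

```python
import math

def max_rank_commutative(dim_E):
    """Largest rank allowed by (58) when A is the diagonal algebra C^n:
    rk(rho) <= dim(E) + 1."""
    return dim_E + 1

def max_rank_full_matrix(dim_E):
    """Largest rank allowed by (58) when A = Mat(n, C):
    rk(rho)^2 <= dim(E) + 1, so rk(rho) <= floor(sqrt(dim(E) + 1))."""
    return math.isqrt(dim_E + 1)

# Bounds for exponential families of dimension 1, ..., 8.
table = [(d, max_rank_commutative(d), max_rank_full_matrix(d)) for d in range(1, 9)]
```

For instance, a local maximizer of the entropy distance from a three-dimensional exponential family has rank at most 4 in the commutative case and at most 2 in the full matrix algebra.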

The following Theorem 4.19 was first proved in Proposition 3.1 in [Ay02] for a
subset of the probability simplex on a finite set $\Omega$ and was proved in Theorem 5.1
in [Ma07] on the whole probability simplex on $\Omega$. In this commutative setting the
theorem says that a local maximizer $P$ of the entropy distance $d_{\mathcal{E}}$ equals the
conditional probability distribution $P = Q(\cdot|A)$, see Remark 1.2.3, of its rI-projection
$Q = \pi_{\mathcal{E}}(P)$ where $A \subset \Omega$ is the support set of $P$.

The entropy distance of a local maximizer is a difference of free energies
(Definition 1.12). We use functional calculus in compressed algebras (Definition 2.3.3).

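Theorem 4.19. Let $\rho \in \mathcal{S}_{\mathcal{A}}$ be any state, $p := s(\rho)$ be the support projection of $\rho$
and let $q := s(\pi_{\mathcal{E}}(\rho))$ be the support projection of the rI-projection $\pi_{\mathcal{E}}(\rho)$. Then there
exists $\theta \in \Theta$ such that $\pi_{\mathcal{E}}(\rho) = R_{q\mathcal{A}q}(q\theta q)$.

1. If $u \in (p\mathcal{A}p)_{sa}$ is a traceless matrix, then $\frac{\partial}{\partial t} d_{\mathcal{E}}(\rho + t u)|_{t=0} = \langle u, \log^{[p]}(\rho) - p\theta p \rangle$.

2. If $\rho$ is a local maximizer of the entropy distance $d_{\mathcal{E}}$ on the state space $\mathcal{S}_{\mathcal{A}}$, then
$\rho = R_{p\mathcal{A}p}(p\theta p)$ and $d_{\mathcal{E}}(\rho) = F_{q\mathcal{A}q}(q\theta q) - F_{p\mathcal{A}p}(p\theta p)$.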

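Proof: The parameter $\theta \in \Theta$ and a projection $q \in \mathcal{P}^U$ such that $\pi_{\mathcal{E}}(\rho) =
R_{q\mathcal{A}q}(q\theta q)$ exist by Definition 4.11 of the rI-projection to $\mathcal{E}$. We notice $p \preceq q$
from Lemma 2.27. The theorem is proved in Theorem 22 in [KW11] for $q = \mathbb{1}$, i.e.
for $\pi_{\mathcal{E}}(\rho)$ invertible in $\mathcal{A}$, and for $\Theta$ consisting of traceless matrices. All assertions
are invariant under the substitution $\theta \mapsto \theta + \lambda \mathbb{1}$ for real $\lambda$, e.g.

$$F_{q\mathcal{A}q}(q(\theta + \lambda \mathbb{1})q) - F_{p\mathcal{A}p}(p(\theta + \lambda \mathbb{1})p) = F_{q\mathcal{A}q}(q\theta q) + \lambda - [F_{p\mathcal{A}p}(p\theta p) + \lambda]$$
$$= F_{q\mathcal{A}q}(q\theta q) - F_{p\mathcal{A}p}(p\theta p).$$

This proves our claim for arbitrary non-empty affine subspaces $\Theta \subset \mathcal{A}_{sa}$ if $q = \mathbb{1}$.
Otherwise, if $q \neq \mathbb{1}$, then Theorem 4.15 shows $d_{\mathcal{E}}(\rho) = d_{\mathcal{E}_q}(\rho)$. We argue analogously
as before but with the algebra $q\mathcal{A}q$ in place of $\mathcal{A}$. This is possible since $\pi_{\mathcal{E}}(\rho)$ is
invertible in $q\mathcal{A}q$ and since $\rho \in q\mathcal{A}q$. The latter is true since $s(\rho) = p \preceq q$. □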
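
The free energy formula in Theorem 4.19.2 can also be seen by a direct computation. The following derivation is a sketch under two assumptions that are not restated in this section: that $\log^{[p]} R_{p\mathcal{A}p}(a) = a - F_{p\mathcal{A}p}(a)\,p$ holds for $a \in (p\mathcal{A}p)_{sa}$, and that the relative entropy is $S(\rho, \sigma) = \operatorname{tr} \rho\,[\log \rho - \log \sigma]$ on the relevant supports. It uses $\rho = R_{p\mathcal{A}p}(p\theta p)$, $\pi_{\mathcal{E}}(\rho) = R_{q\mathcal{A}q}(q\theta q)$, $\rho = p\rho p = q\rho q$ and $\operatorname{tr} \rho = 1$.

```latex
\begin{align*}
d_{\mathcal{E}}(\rho)
  &= S(\rho, \pi_{\mathcal{E}}(\rho))
   = \operatorname{tr} \rho \,\bigl[ \log^{[p]}(\rho) - \log^{[q]}(\pi_{\mathcal{E}}(\rho)) \bigr] \\
  &= \operatorname{tr} \rho \,\bigl[ p\theta p - F_{p\mathcal{A}p}(p\theta p)\, p
       - q\theta q + F_{q\mathcal{A}q}(q\theta q)\, q \bigr] \\
  &= \operatorname{tr}(\rho\theta) - F_{p\mathcal{A}p}(p\theta p)
       - \operatorname{tr}(\rho\theta) + F_{q\mathcal{A}q}(q\theta q) \\
  &= F_{q\mathcal{A}q}(q\theta q) - F_{p\mathcal{A}p}(p\theta p).
\end{align*}
```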
4.4 Non-commutative aspects of exponential families

Some non-commutative aspects of the I-/rI-topology are collected in Corollary 3.22.
We shall now consider an exponential family $\mathcal{E}$. This leads to a continuity conjecture
for the entropy distance.

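The geodesic closure of $\mathcal{E}$ is

$$\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) := \{\rho \in \mathcal{S}_{\mathcal{A}} \mid \rho \text{ is the limit of an e-geodesic in } \mathcal{E}\} \qquad (59)$$

and the inclusions (with $\overline{\mathcal{E}}$ the norm closure)

$$\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) \subset \mathrm{cl}^{rI}(\mathcal{E}) \subset \overline{\mathcal{E}} \qquad (60)$$

were already proved in Corollary 12 in [KW11]. The first inclusion follows from the
arguments used below in the proof of Theorem 4.20. The second inclusion follows
from the Pinsker-Csiszár inequality.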
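
The mechanism behind the second inclusion is that a small relative entropy forces a small norm distance. A minimal numerical sketch follows, assuming the Pinsker-Csiszár inequality in the form $S(\rho, \sigma) \ge \frac{1}{2} \|\rho - \sigma\|_1^2$ with natural logarithm and trace norm (the exact form and norm of (13) are not restated here, so this form is an assumption), and assuming an invertible second argument $\sigma$. All function names are hypothetical.

```python
import numpy as np

def random_state(n, rng):
    """A random full-rank density matrix of size n (illustrative helper)."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def relative_entropy(rho, sigma):
    """S(rho, sigma) = tr rho (log rho - log sigma), natural logarithm.
    Assumes sigma is invertible; zero eigenvalues of rho contribute 0."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-14]
    s, v = np.linalg.eigh(sigma)
    log_sigma = (v * np.log(s)) @ v.conj().T
    return float(np.sum(p * np.log(p)) - np.trace(rho @ log_sigma).real)

def trace_norm(a):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return float(np.sum(np.abs(np.linalg.eigvalsh(a))))

rng = np.random.default_rng(0)
pairs = [(random_state(3, rng), random_state(3, rng)) for _ in range(20)]
# Gap S(rho, sigma) - (1/2) ||rho - sigma||_1^2, expected to be non-negative.
gaps = [relative_entropy(r, s) - 0.5 * trace_norm(r - s) ** 2 for r, s in pairs]
```

In this sketch every entry of `gaps` is non-negative up to rounding, so a sequence with $S(\rho, \sigma_i) \to 0$ converges to $\rho$ in norm.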

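We will see that strict inclusions in (60) are only possible for a non-commutative
algebra $\mathcal{A}$. Examples in $\mathcal{A} = \mathrm{Mat}(2, \mathbb{C}) \oplus \mathbb{C}$ are the Swallow family with $\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) \subsetneq
\mathrm{cl}^{rI}(\mathcal{E})$ and the Staffelberg family with $\mathrm{cl}^{rI}(\mathcal{E}) \subsetneq \overline{\mathcal{E}}$, see §3.3 in [KW11].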

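Theorem 4.20. We have $\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) = \mathrm{cl}^{rI}(\mathcal{E})$ if and only if all faces of the mean value
set $\mathbb{M}_{\mathcal{A}}(U)$ are exposed faces.

Proof: Using (50) and Corollary 2.20 we can write the geodesic closure of $\mathcal{E}$ in
the form

$$\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) = \bigcup_{p \in \mathcal{P}^{U,\perp} \setminus \{0\}} \mathcal{E}_p$$

where $\mathcal{P}^{U,\perp}$ is the exposed projection lattice defined in (31). This is also proved in
Proposition 8 in [KW11]. On the other hand we have the disjoint union

$$\mathrm{cl}^{rI}(\mathcal{E}) = \bigcup_{p \in \mathcal{P}^{U} \setminus \{0\}} \mathcal{E}_p$$

by Theorem 4.15 and Definition 4.9. By the lattice isomorphism (32) the equality
$\mathcal{P}^{U,\perp} = \mathcal{P}^U$ is equivalent to the property that all faces of $\mathbb{M}_{\mathcal{A}}(U)$ are exposed faces. □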

We discuss the norm continuity of the entropy distance (12).

Theorem 4.21. We have $\mathrm{cl}^{rI}(\mathcal{E}) = \overline{\mathcal{E}}$ if and only if the entropy distance $d_{\mathcal{E}}$ is norm
continuous on $\mathcal{S}_{\mathcal{A}}$.

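Proof: If the inclusion $\mathrm{cl}^{rI}(\mathcal{E}) \subset \overline{\mathcal{E}}$ in (60) is strict, then there exists a norm
convergent sequence $(\rho_i)_{i \in \mathbb{N}} \subset \mathrm{cl}^{rI}(\mathcal{E})$ with limit $\rho \in \mathcal{S}_{\mathcal{A}} \setminus \mathrm{cl}^{rI}(\mathcal{E})$ in the compact
state space (see Proposition 2.14). By Theorem 4.15 we have $d_{\mathcal{E}}(\rho) > 0$ while
$d_{\mathcal{E}}(\rho_i) = 0$ for $i \in \mathbb{N}$, hence $d_{\mathcal{E}}$ is discontinuous at $\rho \in \mathcal{S}_{\mathcal{A}}$.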

Conversely, let us prove that $d_{\mathcal{E}}$ is lower semi-continuous if $\overline{\mathcal{E}} = \mathrm{cl}^{rI}(\mathcal{E})$. Since $\overline{\mathcal{E}}$
is a compact subset of $\mathcal{A}_{sa}$, lower semi-continuity of relative entropy (Remark 2.6.1)
implies lower semi-continuity of the minimum

$$\mathcal{S}_{\mathcal{A}} \to \mathbb{R}, \quad \rho \mapsto \min\{S(\rho, \sigma) : \sigma \in \overline{\mathcal{E}}\}.$$

This can be proved using a covering of $\overline{\mathcal{E}}$ by open balls, see Theorem 2 on page 116
in [Be63]. This minimum function equals $d_{\mathcal{E}}$ by Theorem 4.15.

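In order to deduce the continuity of $d_{\mathcal{E}}$ from its lower semi-continuity we
reproduce the proof of Lemma 4.2 in [Ay02]. Let $\rho \in \mathcal{S}_{\mathcal{A}}$ and $(\rho_i)_{i \in \mathbb{N}} \subset \mathcal{S}_{\mathcal{A}}$ such
that $\rho = \lim_{i \to \infty} \rho_i$. The function $\tau \mapsto S(\tau, \sigma)$ is norm continuous on $\mathcal{S}_{\mathcal{A}}$ for every
invertible state $\sigma$. Since $\mathcal{E}$ consists of invertible states we have for all $\sigma \in \mathcal{E}$

$$\lim_{i \to \infty} S(\rho_i, \sigma) = S(\rho, \sigma)$$

and as $d_{\mathcal{E}}(\rho_i) \le S(\rho_i, \sigma)$ holds for all $i \in \mathbb{N}$ we get

$$\limsup_{i \to \infty} d_{\mathcal{E}}(\rho_i) \le \limsup_{i \to \infty} S(\rho_i, \sigma) = S(\rho, \sigma).$$

Taking the infimum over all $\sigma \in \mathcal{E}$ and using the lower semi-continuity of $d_{\mathcal{E}}$ we have

$$\limsup_{i \to \infty} d_{\mathcal{E}}(\rho_i) \le d_{\mathcal{E}}(\rho) \le \liminf_{i \to \infty} d_{\mathcal{E}}(\rho_i).$$

This shows $\lim_{i \to \infty} d_{\mathcal{E}}(\rho_i) = d_{\mathcal{E}}(\rho)$ and proves continuity of $d_{\mathcal{E}}$. □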

Remark 4.22. 1. If $\mathcal{A} \cong \mathbb{C}^N$ then $\mathcal{S}_{\mathcal{A}}$ is a simplex, hence $\mathbb{M}_{\mathcal{A}}(U)$ is a polytope
and all faces of $\mathbb{M}_{\mathcal{A}}(U)$ are exposed faces. Then Theorem 4.20 proves $\mathrm{cl}^{\mathrm{geo}}(\mathcal{E}) =
\mathrm{cl}^{rI}(\mathcal{E})$. Moreover we have $\mathcal{T}^{rI} = \mathcal{T}^{\|\cdot\|}$ by Corollary 3.22, hence $\mathrm{cl}^{rI}(\mathcal{E}) = \overline{\mathcal{E}}$. In
addition, Theorem 4.21 shows that the entropy distance $d_{\mathcal{E}}$ is norm continuous,
a result first proved in [Ay02].

2. The two non-commutative phenomena in Theorem 4.20 and Theorem 4.21 seem
to be connected: A set of two-dimensional families in $\mathcal{A} = \mathrm{Mat}(2, \mathbb{C}) \oplus \mathbb{C}$ has
revealed that non-exposed faces of the mean value set $\mathbb{M}_{\mathcal{A}}(U)$ are generic in the
sense that they exist in an open subset of the Grassmannian manifold of linear
parameter spaces $\Theta = U$. Exponential families with a (norm) discontinuous
entropy distance exist only on the boundary of this open set and are associated
to the creation of non-exposed faces of $\mathbb{M}_{\mathcal{A}}(U)$. Connections between the two
theorems may be rigorously studied by methods of convex algebraic geometry,
see e.g. [RS10]. Some ideas in this direction have been collected in §2.4 in [We11]
and in [We]. We are aware that non-exposed faces of $\mathbb{M}(U)$ are not generic for
2D-planes $U$ in the algebra $\mathrm{Mat}(3, \mathbb{C})$, see [RS05].

3. The projection $\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})} : \mathrm{cl}^{rI}(\mathcal{E}) \to \mathbb{M}_{\mathcal{A}}(U)$ is a bijection by Theorem 4.15. Since
this is a continuous function and since its domain and range is compact, it is
clear that the mean value parametrization of $\mathrm{cl}^{rI}(\mathcal{E})$

$$(\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1} : \mathbb{M}(U) \to \mathrm{cl}^{rI}(\mathcal{E})$$

is continuous if and only if $\mathrm{cl}^{rI}(\mathcal{E}) = \overline{\mathcal{E}}$. The mean value parametrization

$$(\pi_U|_{\mathcal{E}})^{-1} : \mathrm{ri}\,\mathbb{M}(U) \to \mathcal{E}$$

is real analytic by Corollary 4.6. Similarly, in Theorem 4.26 the assignment
$\mathrm{cs}(u_1, \ldots, u_k) \to \mathrm{cl}^{rI}[R(\mathrm{span}(u_1, \ldots, u_k))]$, $\xi \mapsto \rho(\xi)$ can be discontinuous while
the restriction $\mathrm{ri}\,\mathrm{cs}(u_1, \ldots, u_k) \to R(\mathrm{span}(u_1, \ldots, u_k))$ is real analytic.
The norm continuity of the entropy distance in $\mathbb{C}^N$ may have a counterpart in a
non-commutative algebra.

Conjecture 4.23. The entropy distance from $\mathcal{E}$ is continuous for $\mathcal{T}^{rI}$ and $\mathcal{T}^{I}$.

Remark 4.24 (Continuity of the entropy distance). 1. Conjecture 4.23 is not
impossible because $\mathcal{T}^{rI} \supset \mathcal{T}^{\|\cdot\|}$ holds by Theorem 3.21.1. The finer topology $\mathcal{T}^{rI}$
may balance the norm discontinuities found in Theorem 4.21. Secondly, in a
commutative algebra the norm continuity of $d_{\mathcal{E}}$ and the equality $\mathcal{T}^{rI} = \mathcal{T}^{\|\cdot\|}$, see
Remark 4.22.1, is a sign of evidence for the rI-continuity.

2. The method of Theorem 4.21 is useless for any proof of the continuity of dg for 
T^^^ and T'^. This can be said because cY^{S) is not T"^^ compact unless the mean 
value set M([/) is a polytope. Otherwise M(f/) has infinitely many extreme points 
7Tu(¥j[{p)) for certain p G . For such p the open rl-disks V^^{:[^, oo), defined 

in (37), are an infinite open cover of cV^{£) and each is only contained in one 
set of the cover. 

4.5 The complete Pythagorean theorem 

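We prove the complete Pythagorean theorem for an exponential family $\mathcal{E}$. We
apply it to the maximization of the von Neumann entropy under linear constraints,
providing previously unknown solutions.

The entropy distance $d_{\mathcal{E}}(\rho)$ of a state $\rho \in \mathcal{S}_{\mathcal{A}}$ from the exponential family $\mathcal{E}$
equals the relative entropy of $\rho$ from its rI-projection $\pi_{\mathcal{E}}(\rho) \in \mathrm{cl}^{rI}(\mathcal{E})$ to the rI-closure
$\mathrm{cl}^{rI}(\mathcal{E})$ of $\mathcal{E}$. In addition, $d_{\mathcal{E}}(\rho)$ is less than the relative entropy from any state
$\sigma \neq \pi_{\mathcal{E}}(\rho)$ in the rI-closure of $\mathcal{E}$:

$$S(\rho, \pi_{\mathcal{E}}(\rho)) < S(\rho, \sigma).$$

This is proved in Theorem 4.15. Now we compute the value of that difference,
extending the Pythagorean theorem in §1.3.

Theorem 4.25 (Complete Pythagorean theorem). If $\rho \in \mathcal{S}_{\mathcal{A}}$ and if $\sigma$ lies in the
rI-closure $\mathrm{cl}^{rI}(\mathcal{E})$ of $\mathcal{E}$, then $S(\rho, \pi_{\mathcal{E}}(\rho)) + S(\pi_{\mathcal{E}}(\rho), \sigma) = S(\rho, \sigma)$.

Proof: Theorem 4.15 shows $\mathrm{cl}^{rI}(\mathcal{E}) = \mathrm{ext}(\mathcal{E})$. Then by Definition 4.9 of the
extension $\mathrm{ext}(\mathcal{E})$ there exist projections $p, q \in \mathcal{P}^U$, such that $\pi_{\mathcal{E}}(\rho) \in \mathcal{E}_p$ and $\sigma \in \mathcal{E}_q$.
If $p \not\preceq q$ then $s(\rho) \not\preceq q$ follows from Lemma 2.27 since $q \in \mathcal{P}^U$. We get

$$S(\pi_{\mathcal{E}}(\rho), \sigma) = S(\rho, \sigma) = \infty$$

and the non-negativity of the relative entropy proves the claim.

Let $p \preceq q$. Then we have $s(\rho) \preceq p = s(\pi_{\mathcal{E}}(\rho)) \preceq q = s(\sigma)$ and only finite relative
entropies appear in the claimed equation. We subtract the trivial equation

$$S(\pi_{\mathcal{E}}(\rho), \pi_{\mathcal{E}}(\rho)) + S(\pi_{\mathcal{E}}(\rho), \sigma) = S(\pi_{\mathcal{E}}(\rho), \sigma)$$

and continue to show that the resulting difference

$$x := S(\rho, \pi_{\mathcal{E}}(\rho)) - S(\rho, \sigma) - [S(\pi_{\mathcal{E}}(\rho), \pi_{\mathcal{E}}(\rho)) - S(\pi_{\mathcal{E}}(\rho), \sigma)]$$
$$= \operatorname{tr}\,[\rho - \pi_{\mathcal{E}}(\rho)]\,[\log^{[q]} \sigma - \log^{[p]} \pi_{\mathcal{E}}(\rho)]$$

is zero. By Definition 4.9 of $\mathcal{E}_p$ resp. $\mathcal{E}_q$ there exist $\theta, \tilde{\theta} \in \Theta$ and $y, \tilde{y} \in \mathbb{R}$, such that
$\log^{[p]} \pi_{\mathcal{E}}(\rho) = c^p(\theta) + y p$ resp. $\log^{[q]} \sigma = c^q(\tilde{\theta}) + \tilde{y} q$. Since $s(\rho) \preceq s(\pi_{\mathcal{E}}(\rho)) = p \preceq q$
this gives

$$x = \operatorname{tr}\,[\rho - \pi_{\mathcal{E}}(\rho)]\,[c^q(\tilde{\theta}) - c^p(\theta)] = \operatorname{tr}\,[\rho - \pi_{\mathcal{E}}(\rho)]\,[\tilde{\theta} - \theta].$$

By Definition 4.11 of the projection $\pi_{\mathcal{E}}$ the difference $\rho - \pi_{\mathcal{E}}(\rho)$ is perpendicular to
$U$, hence $x = 0$. □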
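
The identity of Theorem 4.25 can be illustrated in the commutative algebra $\mathbb{C}^4$, viewed as probability vectors on $\{0,1\}^2$. The sketch below is an assumption-laden special case and not the general non-commutative statement: the exponential family is taken to be the family of product distributions with full support, whose rI-projection of a joint distribution is the product of its marginals (it has the same mean values and lies in the family). The arrays `P` and `sigma` are hypothetical example data; `P` deliberately has a zero entry.

```python
import numpy as np

def kl(p, q):
    """Relative entropy sum_x p(x) log(p(x)/q(x)) with the convention 0 log 0 = 0.
    Assumes q is positive wherever p is positive."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical joint distribution on {0,1} x {0,1} with one zero entry.
P = np.array([[0.5, 0.1],
              [0.0, 0.4]])

# Product of the marginals of P: the projection of P onto the product family.
Q = np.outer(P.sum(axis=1), P.sum(axis=0))

# An arbitrary element of the product family.
sigma = np.outer([0.3, 0.7], [0.6, 0.4])

lhs = kl(P, Q) + kl(Q, sigma)   # S(P, pi(P)) + S(pi(P), sigma)
rhs = kl(P, sigma)              # S(P, sigma)
```

Here `lhs` and `rhs` coincide up to rounding, and `kl(P, Q)` is the entropy distance of `P` from the product family (its mutual information).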

We maximize the von Neumann entropy under linear constraints. The invertible
solutions for $p = \mathbb{1}$ are well-known, see e.g. p. 125 in [1097]. The non-invertible
solutions depend on a projection lattice $\mathcal{P}^U$, where the linear space $U$ imposes the
linear constraints. The lattice can in principle be computed by spectral analysis,
see Remark 2.26. We formulate the result in terms of the mean value map $m_{u_1,\ldots,u_k}$
and the convex support $\mathrm{cs}(u_1, \ldots, u_k)$, defined in (5) and (6).

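Theorem 4.26. 1. Let $U \subset \mathcal{A}_{sa}$ be a linear subspace of self-adjoint matrices. We
consider the exponential family $\mathcal{E} := R_{\mathcal{A}}(U)$. For every $u$ in the mean value set
$\mathbb{M}_{\mathcal{A}}(U)$ the equality

$$\operatorname{argmax}\{S(\rho) \mid \rho \in (u + U^\perp) \cap \mathcal{S}_{\mathcal{A}}\} = (\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1}(u) \qquad (61)$$

holds and $(\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1}(u)$ has cardinality one. In particular, the set of solutions
to (61) is the rI-closure of $\mathcal{E}$.

2. Let $U := \mathrm{span}_{\mathbb{R}}(u_1, \ldots, u_k)$ for some $u_1, \ldots, u_k \in \mathcal{A}_{sa}$ and let $\xi = (\xi_1, \ldots, \xi_k) \in
\mathrm{cs}_{\mathcal{A}}(u_1, \ldots, u_k)$ belong to the convex support. Then

$$\operatorname{argmax}\{S(\rho) \mid \rho \in \mathcal{S}_{\mathcal{A}} \text{ such that } m_{u_1,\ldots,u_k}(\rho) = \xi\} \qquad (62)$$

has cardinality one. There exists a unique projection $p \in \mathcal{P}^U$ and there exist real
numbers $\beta_1, \ldots, \beta_k$ such that

$$\Big(\frac{\partial}{\partial \beta_1}, \ldots, \frac{\partial}{\partial \beta_k}\Big) F_{p\mathcal{A}p}\Big(c^p\Big(-\sum_{i=1}^k \beta_i u_i\Big)\Big) = -\xi \qquad (63)$$

holds for the free energy $F_{p\mathcal{A}p}$ on the algebra $p\mathcal{A}p$. If $\rho(\xi)$ denotes the maximizer
in (62) then each solution $[p, (\beta_1, \ldots, \beta_k)]$ of (63) satisfies

$$\rho(\xi) = R_{p\mathcal{A}p}\Big(c^p\Big(-\sum_{i=1}^k \beta_i u_i\Big)\Big) \qquad (64)$$

and its von Neumann entropy is $S(\rho(\xi)) = F_{p\mathcal{A}p}(c^p(-\sum_{i=1}^k \beta_i u_i)) + \sum_{i=1}^k \beta_i \xi_i$.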
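Proof: There is a bijection $\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})} : \mathrm{cl}^{rI}(\mathcal{E}) \to \mathbb{M}_{\mathcal{A}}(U)$ by Theorem 4.15. Hence
we can consider the inverse $(\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1} : \mathbb{M}_{\mathcal{A}}(U) \to \mathrm{cl}^{rI}(\mathcal{E})$. If $u \in \mathbb{M}(U)$ then
in order to prove (61), it is sufficient to show for all $\rho \in (u + U^\perp) \cap \mathcal{S}_{\mathcal{A}}$ with
$\rho \neq (\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1}(u)$, that

$$S\big((\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1}(u)\big) > S(\rho).$$

Since $\pi_U(\rho) = u$ we have $(\pi_U|_{\mathrm{cl}^{rI}(\mathcal{E})})^{-1}(u) = \pi_{\mathcal{E}}(\rho)$. Since $\mathbb{1}/\operatorname{tr}(\mathbb{1}) \in \mathcal{E} = R_{\mathcal{A}}(U)$,
Theorem 4.25 shows

$$S(\rho, \pi_{\mathcal{E}}(\rho)) + S(\pi_{\mathcal{E}}(\rho), \mathbb{1}/\operatorname{tr}(\mathbb{1})) = S(\rho, \mathbb{1}/\operatorname{tr}(\mathbb{1})).$$

Hence $S(\sigma, \mathbb{1}/\operatorname{tr}(\mathbb{1})) = \log \operatorname{tr} \mathbb{1} - S(\sigma)$ for $\sigma \in \mathcal{S}_{\mathcal{A}}$ shows

$$S(\rho) = S(\pi_{\mathcal{E}}(\rho)) - S(\rho, \pi_{\mathcal{E}}(\rho)).$$

The state $\pi_{\mathcal{E}}(\rho)$ is the maximizer since the non-negative relative entropy $S(\rho, \pi_{\mathcal{E}}(\rho))$
is only zero for $\rho = \pi_{\mathcal{E}}(\rho)$. It is clear from the construction that the set of maximizers
is the rI-closure $\mathrm{cl}^{rI}(\mathcal{E})$.

The coordinate formulation (62) follows from Remark 1.2.5 stating $m_{u_1,\ldots,u_k} =
m_{u_1,\ldots,u_k} \circ \pi_U$ and that $m_{u_1,\ldots,u_k}|_{\mathbb{M}(U)} : \mathbb{M}(U) \to \mathrm{cs}(u_1, \ldots, u_k)$ is a bijection.

The differential condition (63) on the coefficients $[p, (\beta_1, \ldots, \beta_k)]$ of the
maximizer $\rho := \rho(\xi)$ and its exponential expression (64) are proved in Corollary 4.16 (up
to the sign). Let $\rho = R_{p\mathcal{A}p}(c^p(-\sum_{i=1}^k \beta_i u_i))$. Then we have

$$-\operatorname{tr} \rho\,\Big[c^p\Big(-\sum_{i=1}^k \beta_i u_i\Big)\Big] = \sum_{i=1}^k \beta_i \langle u_i, \rho \rangle = \sum_{i=1}^k \beta_i \xi_i.$$

The von Neumann entropy $S(\rho) = -\operatorname{tr} \rho \log(\rho)$ simplifies to

$$S(\rho) = -\operatorname{tr} \rho\,\Big[c^p\Big(-\sum_{i=1}^k \beta_i u_i\Big) - F_{p\mathcal{A}p}\Big(c^p\Big(-\sum_{i=1}^k \beta_i u_i\Big)\Big)\Big]$$

by an expansion of the exponential expression (64) under the logarithm $\log^{[p]}$. □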
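
For the invertible case $p = \mathbb{1}$ the conditions (63) and (64) can be solved numerically. The sketch below is illustrative only: it minimizes the convex function $\beta \mapsto F(-\sum_i \beta_i u_i) + \sum_i \beta_i \xi_i$ by plain gradient descent, which is a generic technique swapped in here and not a method from the text; its gradient vanishes exactly when (63) holds. The conventions $F(\theta) = \log \operatorname{tr} e^{\theta}$ and $R(\theta) = e^{\theta}/\operatorname{tr} e^{\theta}$, the step size, the function names and the qubit data are assumptions. Boundary points of the convex support, where $p \neq \mathbb{1}$, are not handled.

```python
import numpy as np

def free_energy(theta):
    """F(theta) = log tr exp(theta) (assumed convention)."""
    w = np.linalg.eigvalsh(theta)
    m = w.max()
    return float(m + np.log(np.sum(np.exp(w - m))))

def gibbs_state(theta):
    """R(theta) = exp(theta) / tr exp(theta)."""
    w, v = np.linalg.eigh(theta)
    e = np.exp(w - w.max())
    return (v * (e / e.sum())) @ v.conj().T

def von_neumann_entropy(rho):
    """S(rho) = -tr rho log rho with 0 log 0 = 0."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]
    return float(-np.sum(w * np.log(w)))

def max_entropy_state(us, xi, step=0.5, iters=2000, tol=1e-12):
    """Gradient descent on beta -> F(-sum beta_i u_i) + beta . xi.
    The gradient is xi - (tr(u_i rho_beta))_i, which is zero iff (63) holds (p = 1)."""
    xi = np.asarray(xi, dtype=float)
    beta = np.zeros(len(us))
    for _ in range(iters):
        rho = gibbs_state(-sum(b * u for b, u in zip(beta, us)))
        grad = xi - np.array([np.trace(u @ rho).real for u in us])
        if np.linalg.norm(grad) < tol:
            break
        beta -= step * grad
    return beta, gibbs_state(-sum(b * u for b, u in zip(beta, us)))

# Hypothetical qubit example: prescribe the mean values of Pauli X and Z.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
xi = np.array([0.3, -0.4])
beta, rho = max_entropy_state([X, Z], xi)

entropy_direct = von_neumann_entropy(rho)
entropy_formula = free_energy(-(beta[0] * X + beta[1] * Z)) + float(beta @ xi)
```

In this example the maximizer is the state with Bloch vector $(0.3, 0, -0.4)$, and `entropy_direct` agrees with `entropy_formula`, the expression for $S(\rho(\xi))$ in Theorem 4.26.2.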

5 Comment on the representation 

We now consider an arbitrary finite-dimensional C*-algebra $\mathcal{A}$. We show that the
I-/rI-topology is independent of the representation. In order that exponential
families can be defined in $\mathcal{A}$ we have to use a representation of $\mathcal{A}$ as a C*-subalgebra
of $\mathrm{Mat}(n, \mathbb{C})$, but the complete projection theorem and Pythagorean theorem are
independent of this choice. The maximization of the von Neumann entropy does
depend on the representation of $\mathcal{A}$ as a C*-subalgebra of $\mathrm{Mat}(n, \mathbb{C})$.

The first object, the relative entropy, is monotone under C*-morphisms

$$\Phi : \mathcal{B} \to \mathcal{A}$$

between two unital C*-algebras [Uh77], i.e. if $f, g : \mathcal{A} \to \mathbb{C}$ are two states and
if $\Phi^*(f) := f \circ \Phi$ then $S(\Phi^*(f), \Phi^*(g)) \le S(f, g)$ holds. If $\Phi : \mathcal{B} \to \mathcal{A}$ is a
C*-isomorphism then $\Phi^{-1}$ provides the opposite inequality and $S(\Phi^*(f), \Phi^*(g)) =
S(f, g)$ follows. In particular, our results about the I-/rI-topology in §3.2 are valid
in any finite-dimensional C*-algebra independent of the representation.

The second object, an exponential family, can be defined if the finite-dimensional
C*-algebra $\mathcal{A}$ is represented by an algebra of linear operators on a Hilbert space. The
normal state space of $\mathcal{A}$ consists of all positive and normalized trace class operators
$\rho$ in $\mathcal{A}$, with associated functional (see Theorem 2.4.21 in [Br87], also Remark 1.11)

$$a \mapsto \langle a, \rho \rangle = \operatorname{tr}(a\rho), \qquad (a \in \mathcal{A}).$$

Elements $\rho$ in an exponential family are defined by functional calculus but the
normalization condition restricts the possible representations of $\mathcal{A}$. E.g. the
one-dimensional algebra $\mathbb{C}$ represented as $\{x = (x_i)_{i \in \mathbb{N}} \in \ell^\infty \mid \exists \lambda \in \mathbb{C}, \forall i \in \mathbb{N} : x_i = \lambda\}$
has no normal state.

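Every finite-dimensional C*-algebra is, according to Theorem III.1.1 in [Da96],
C*-isomorphic to the direct sum

$$\mathcal{B} := \bigoplus_{i=1}^N \mathrm{Mat}(k_i, \mathbb{C}), \qquad (65)$$

where $N \in \mathbb{N}_0$ and $k \in \mathbb{N}^N$ is a multi-index. Up to unitary equivalence, any C*-algebra
$\mathcal{A}$ of linear operators which is C*-isomorphic to $\mathcal{B}$, has the form

$$\mathcal{A} := \bigoplus_{i=1}^N \Big\{ \bigoplus_{j=1}^{m_i} a_i \;\Big|\; a_i \in \mathrm{Mat}(k_i, \mathbb{C}) \Big\} \oplus 0_l$$

with cardinalities $m_i \ge 1$ for $i = 1, \ldots, N$ and $l \ge 0$, see Corollary III.2.1 in [Da96],
and there is a C*-isomorphism $\Phi : \mathcal{B} \to \mathcal{A}$

$$\Phi\Big(\bigoplus_{i=1}^N b_i\Big) = \bigoplus_{i=1}^N \bigoplus_{j=1}^{m_i} b_i \oplus 0_l, \qquad (b_1, \ldots, b_N) \in \mathcal{B}. \qquad (66)$$

Only matrices that vanish on each summand of infinite multiplicity can be
normalized, so exponential families can only be defined in an algebra $\mathcal{A}$ with $k_1 m_1 +
\cdots + k_N m_N < \infty$. Hence we shall assume $\mathcal{A} \subset \mathrm{Mat}(n, \mathbb{C})$ for $n \in \mathbb{N}$ in the following.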

Still, we are interested in how different representations of $\mathcal{A}$ as a C*-subalgebra of
$\mathrm{Mat}(n, \mathbb{C})$ affect our results about exponential families. Let us begin with mean
values. We had preferred the mean value set (4) to the isomorphic convex support
(6) because of the simplicity of the Hilbert-Schmidt Euclidean geometry, see
Remark 1.2.5. Convex support has other advantages. Firstly, it is equivariant under
the isomorphism (66): $f(b) = (\Phi^{-1})^* f(\Phi(b))$ holds for all states $f$ on $\mathcal{B}$ and $b \in \mathcal{B}$.
The mean value set is not equivariant. Another advantage of the convex support is
that the decomposition of Lemma 2.23 becomes a simple inclusion $\mathrm{cs}_{p\mathcal{A}p} \subset \mathrm{cs}_{\mathcal{A}}$:

Lemma 5.1. Let $u_1, \ldots, u_k \in \mathcal{A}_{sa}$, put $U := \mathrm{span}_{\mathbb{R}}(u_1, \ldots, u_k)$ and let $p \in \mathcal{P}^U$.
Then the following diagram commutes.

Proof: We extend the second diagram in Lemma 2.23. By Remark 1.2.5 we have
$m_{u_1,\ldots,u_k} \circ \pi_U = m_{u_1,\ldots,u_k}$ and $m_{u_1,\ldots,u_k}|_{\mathbb{M}(U)} : \mathbb{M}(U) \to \mathrm{cs}_{\mathcal{A}}(u_1, \ldots, u_k)$ is a bijection.
In the algebra $p\mathcal{A}p$ this means $m_{c^p(u_1),\ldots,c^p(u_k)} \circ \pi_{c^p(U)} = m_{c^p(u_1),\ldots,c^p(u_k)}$ and

$$m_{c^p(u_1),\ldots,c^p(u_k)}|_{\mathbb{M}(c^p(U))} : \mathbb{M}(c^p(U)) \to \mathrm{cs}_{p\mathcal{A}p}(c^p(u_1), \ldots, c^p(u_k))$$

is a bijection. For all $a \in p\mathcal{A}p$ and $i = 1, \ldots, k$ we have

$$\langle u_i, a \rangle = \langle u_i, pap \rangle = \langle c^p(u_i), a \rangle,$$

hence $m_{u_1,\ldots,u_k}(a) = m_{c^p(u_1),\ldots,c^p(u_k)}(a)$ holds and completes the proof. □

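Next we consider exponential families. The adjoint $\Phi^* : \mathcal{A}^* \to \mathcal{B}^*$ of (66) is given
for $F_i \in \mathrm{Mat}(k_i, \mathbb{C})$, $i = 1, \ldots, N$, by

$$\Phi^*\Big(\bigoplus_{i=1}^N \bigoplus_{j=1}^{m_i} F_i \oplus 0_l\Big) = \bigoplus_{i=1}^N m_i F_i. \qquad (67)$$

Lemma 5.2. Let $\Theta \subset \mathcal{B}_{sa}$ be a non-empty affine subspace and $\mathcal{E} := R_{\mathcal{B}}(\Theta)$. Let
$\theta_0 := \bigoplus_{i=1}^N \ln(m_i) \mathbb{1}_{k_i} \in \mathcal{B}_{sa}$. Then the affine space $\widetilde{\Theta} := \Phi(\Theta - \theta_0) \subset \mathcal{A}_{sa}$ satisfies

$$(\Phi^*)^{-1}(\mathcal{E}) = R_{\mathcal{A}}(\widetilde{\Theta}).$$

Proof: By (67) we have $\Phi^* \circ R_{\mathcal{A}} \circ \Phi(\theta) = R_{\mathcal{B}}(\theta + \bigoplus_{i=1}^N \ln(m_i) \mathbb{1}_{k_i})$ for $\theta \in \mathcal{B}_{sa}$. □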

Lemma 5.2 shows that the class of exponential families is preserved under the
isomorphism (66). Clearly $\Phi^*(\rho + U^\perp) = \Phi^*(\rho) + \Phi^{-1}(U)^\perp$ holds for all $\rho \in \mathcal{S}_{\mathcal{A}}$ and
$U \subset \mathcal{A}_{sa}$. We conclude that the complete projection theorem and the complete
Pythagorean theorem (Theorem 4.15 and Theorem 4.25) are valid for finite-dimensional
C*-algebras independent of the representation.

We finish with the negative result that the von Neumann entropy maximization
in Theorem 4.26 is not equivariant under the isomorphism (66). The reason is that
$\widetilde{\Theta}$ in Lemma 5.2 is not necessarily a linear space even though $\Theta$ is a linear space.

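Example 5.3. The unconstrained maximum of the von Neumann entropy in the
algebra $\mathcal{B} = \mathbb{C}^2$ is $\log(2)$ while for $n \ge 3$ and $\mathcal{A} = \mathbb{C} \oplus \{(\lambda, \ldots, \lambda) \in \mathbb{C}^{n-1} \mid \lambda \in \mathbb{C}\}$
the maximum is $\log(n)$. The second maximum is assumed uniquely at $(\frac{1}{n}, \ldots, \frac{1}{n})$
and $\Phi^*(\frac{1}{n}, \ldots, \frac{1}{n}) = (\frac{1}{n}, \frac{n-1}{n})$ has the von Neumann entropy

$$S\big(\tfrac{1}{n}, \tfrac{n-1}{n}\big) = \log(2) - \tfrac{1}{n} \log\Big[\Big(\tfrac{2(n-1)}{n}\Big)^n / (n-1)\Big] < \log(2)$$

by the Bernoulli inequality. Moreover $S(\frac{1}{n}, \frac{n-1}{n}) \to 0$ for $n \to \infty$ holds by the continuity of
the von Neumann entropy.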
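
The entropy values in Example 5.3 are easy to confirm numerically. The following sketch uses natural logarithms; the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon / von Neumann entropy of a probability vector, with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pushed_entropy(n):
    """Entropy of Phi^*(1/n, ..., 1/n) = (1/n, (n-1)/n)."""
    return entropy([1 / n, (n - 1) / n])

def closed_form(n):
    """log(2) - (1/n) log[(2(n-1)/n)^n / (n-1)], the expression in Example 5.3."""
    return math.log(2) - math.log((2 * (n - 1) / n) ** n / (n - 1)) / n

# Compare both expressions and the bound log(2) for n = 3, ..., 50.
rows = [(n, pushed_entropy(n), closed_form(n)) for n in range(3, 51)]
```

For every $n$ in this range the two expressions agree and stay strictly below $\log(2)$, while the maximum $\log(n)$ in $\mathcal{A}$ grows without bound.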